Mathematical Knowledge Management Across Formal Libraries
Mathematisches Wissensmanagement über formale Bibliotheksgrenzen hinaus

Submitted to the Faculty of Engineering of the Friedrich-Alexander-Universität Erlangen-Nürnberg for the degree of

Doktor-Ingenieur

presented by Dennis Müller from Garmisch-Partenkirchen

Approved as a dissertation by the Faculty of Engineering of the Friedrich-Alexander-Universität Erlangen-Nürnberg

Date of the oral examination: 18.10.2019
Chair of the doctoral committee: Prof. Dr.-Ing. Andreas Paul Fröba
Reviewers: Prof. Dr. Michael Kohlhase, assoz. Prof. Dr. Claudio Sacerdoti Coen
This dissertation is concerned with the problem of integrating formal libraries: There is a plurality of theorem prover systems and related software with different strengths and weaknesses, that are fundamentally incompatible. This is due to different input languages, IDEs, library management facilities etc., but also due to fundamental theoretical aspects; primarily the reliance on different logical foundations, such as set theories, type theories, and additional primitive features (e.g. subtyping, specific module systems,...).
This thesis describes the approach towards solving this problem pursued at our research group. Specifically, it is divided into three main parts.
I am deeply indebted to the following people, who all contributed in some way to my success in – and/or well-being during – the writing of this thesis:
First and foremost my parents, for their unconditional support in all my endeavours, no matter how seemingly misguided.
My wonderful girlfriend Tina, for her unrestrained love and affection, and for happily playing Tropico next to me when I need to get work done while being with her.
My band mates in Vroudenspil, for providing me with a desperately needed outlet outside of my profession, besides being genuinely lovely friends to have. # ShamelessPlug
Many researchers I had the pleasure to meet, and who inspired, taught, advised, humoured, supported, and entertained me, and just generally have been extremely nice to me in various ways; including but not limited to: Jacques Carette, Merlin Carl, Claudio Sacerdoti Coen, Paul-Olivier Dehaye, Bill Farmer, Michael Junk, Cezary Kaliszyk, Peter Koepke, Andrea Kohlhase, Thomas Koprucki, Markus Pfeiffer, Natarajan Shankar, Nicolas Thiéry, Makarius Wenzel, ...
My colleagues at KWARC, for contributing to establishing a productive, and yet unusually comfortable and fun work environment: Katja Berčič, Jonas Betzendahl, Deyan Ginev, Mihnea Iancu, Constantin Jucovschi, Theresa Pollinger, Max Rapp, Frederik Schaefer, Tom Wiesing.
In particular Florian Rabe, who acted as a second advisor, especially in technical matters, and helped me literally every step of the way.
And last but not least Michael Kohlhase, for being an exceptionally passionate, caring and inspiring advisor, and for avoiding without exception all negative tropes about PhD advisors that make up the central plots of countless horror stories that traumatized postgrads tell each other huddled around bonfires made from the revisions of (ultimately failed) thesis drafts.
“You don’t have to have a dream. Americans on talent shows always talk about their dreams. Fine, if you have something you’ve always wanted to do, dreamed of, like, in your heart, go for it. After all, it’s something to do with your time, chasing a dream. And if it’s a big enough one, it’ll take you most of your life to achieve, so by the time you get to it and are staring into the abyss of the meaninglessness of your achievement you’ll be almost dead, so it won’t matter.
I never really had one of these dreams, and so I advocate passionate dedication to the pursuit of short-term goals. Be micro-ambitious. Put your head down and work with pride on whatever is in front of you. You never know where you might end up. Just be aware, the next worthy pursuit will probably appear in your periphery, which is why you should be careful of long-term dreams. If you focus too far in front of you, you won’t see the shiny thing out the corner of your eye.”
“Don’t seek happiness. Happiness is like an orgasm: If you think about it too much it goes away. Keep busy and aim to make someone else happy, and you might find you get some as a side effect.”
“Be hard on your opinions. A famous bon mot asserts that opinions are like arseholes in that everyone has one. There is great wisdom in this, but I would add that opinions differ significantly from arseholes in that yours should be constantly and thoroughly examined.”
“Remember it’s all luck. You are lucky to be here. You are incalculably lucky to be born, and incredibly lucky to be brought up by a nice family that helped you get educated and encouraged you to go to uni.
Or, if you were born into a horrible family, that’s unlucky and you have my sympathy, but you are still lucky. Lucky that you happen to be made of the sort of DNA that went on to make the sort of brain which, when placed in a horrible childhood environment, would make decisions that meant you ended up eventually graduating uni. Well done, you, for dragging yourself up by your shoelaces. But you were lucky. You didn’t create the bit of you that dragged you up. They’re not even your shoelaces.
I suppose I worked hard to achieve whatever dubious achievements I’ve achieved, but I didn’t make the bit of me that works hard, any more than I made the bit of me that ate too many burgers instead of attending lectures when I was [at uni].
Understanding that you can’t truly take credit for your successes, nor truly blame others for their failures, will humble you and make you more compassionate.
Empathy is intuitive, but it is also something you can work on intellectually.”
Contents

3.1 LATIN
6 LFX
13 Conclusion
13.1.3 Search
14 Alignments
15.3 Example
16.1.2 Search
Part I
Introduction
Chapter 1
Motivation
In the last decades, the formalization of mathematical knowledge (and the verification and automation of formal proofs) has attracted ever-increasing interest. Formal methods nowadays are not just used by computer scientists to verify software and hardware as well as in program synthesis, but – due to problems such as Kepler’s conjecture [Hal+15a], the classification theorem for finite simple groups [Sol95], etc. – are also becoming increasingly popular among mathematicians in general.
By now, there is a vast plurality of formal systems and corresponding libraries to choose from. However, almost all of these are non-interoperable, because they are based on differing, mutually incompatible logical foundations (e.g. set theories, higher-order logic, variants of type theory), library formats, and library structures; consequently, much work is spent developing the respective basic libraries in each system.
Moreover, since a library in one such system is not reusable in another system, developers are forced to spend an enormous amount of time and energy developing basic library organization features such as distribution, browsing/search or change management for each library format; all of which binds resources that could be used to improve core functionality and library contents instead.
One reason for the incompatibility is the widespread usage of the homogeneous method, which fixes some logical foundation with all primitive notions (e.g. types, axioms, inference rules) and uses conservative extensions of this foundation to allow for modeling some specific domain knowledge. While homogeneous reasoning is conveniently implementable and verifiable, it implies that a lot of work is needed just to model the basic domains of discourse necessary for mathematics, such as real numbers, algebraic theories, etc. Moreover, the resulting formalizations are actually less valuable to mathematicians, since they are intimately dependent on (and framed in terms of) the underlying logic, making it virtually impossible to move the results between different foundations or to abstract from them.
In contrast, mathematical practice favors the heterogeneous method, as exemplified by the works of Bourbaki [Bou64]. In heterogeneous reasoning, theories are used to introduce new primitive notions and the truth of some proposition is relative to a theory (as opposed to absolute with respect to some foundation). This method is closely related to the “little theories” paradigm [FGT92a] which prefers to state each theorem in the weakest possible theory, thus optimizing reusability of mathematical results. Even though in theory, all of mathematics can be reduced to first principles – most prominently first-order logic with (some extension of) ZFC – it is usually carried out in a highly abstract setting that hides the foundation, often rendering it irrelevant and thus allowing it to be ignored.
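As an illustration of the little-theories style, the following Lean sketch (assuming Mathlib's `Monoid` class; the theorem name is made up) states a result once in the weakest theory supporting it and then reuses it in a concrete instance:

```lean
-- Stated once for an arbitrary monoid, not for any particular number system:
theorem shuffle {M : Type} [Monoid M] (a b c : M) :
    (a * b) * c = a * (b * c) :=
  mul_assoc a b c

-- Inherited for free by every instance, e.g. the natural numbers:
example (a b c : ℕ) : (a * b) * c = a * (b * c) := shuffle a b c
```

Any result proved at the monoid level is automatically available in every structure that happens to be a monoid, which is precisely the reusability that the little-theories paradigm optimizes for.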
Correspondingly, there is an inconvenient discrepancy between the currently existing formal libraries and the way (informal) mathematics is usually done in practice. Furthermore, the available formal knowledge is distributed among dozens of different mutually incompatible formal library systems with considerable overlap between them.
Additionally, [Car+19] recently introduced the five aspects of big math systems – aspects of doing mathematics that can and should be supported by software, collectively referred to as the tetrapod, as in Figure 1.1. These aspects are

1. inference, as supported by theorem provers and proof assistants,
2. computation, as supported by computer algebra systems,
3. tabulation, as supported by mathematical databases,
4. narration, as supported by LaTeX and other text formatting software with strong support for mathematical expressions, and finally
5. organization, as supported by mathematical knowledge management systems.
Most software systems focus on one of those tetrapod aspects, few on two, and none strongly support all of them. That is despite the fact that all five of these aspects are intrinsic and important parts of doing real-world mathematics, and it is correspondingly vital to support all of them in an interconnected workflow.
What we should consequently aim for is a universal archiving solution for all formal knowledge that supports all of these aspects.
The aim of this thesis is to describe our approach to achieving such an archiving solution by integrating the libraries of existing software systems, focusing on the inference, computation, tabulation and organization aspects of the tetrapod, and to demonstrate its feasibility.
Chapter 2
State of the Art
The attempt to systematically formalize both mathematical knowledge and its semantics goes back at least to the seminal work by Bertrand Russell and Alfred North Whitehead [WR13]. In the 1950s and 1960s, computer systems were added to the tool chest for this endeavour, shifting the focus to designing foundations that combine both machine-friendliness and human readability. This enabled automated theorem proving, thanks to ideas going back to Allen Newell, Herbert A. Simon, and Martin Davis. This has been most successful for first-order logic and related systems. For more expressive languages, the verification of human-written formal proofs has been the more successful approach, going back to John McCarthy, Nicolaas Govert de Bruijn, Robin Milner, and Per Martin-Löf. Modern proof assistants usually combine both approaches, generally verifying user input interactively and employing automation for routine proof steps whenever possible.
Since developing sophisticated proof systems requires both a lot of practical work and a high level of theoretical knowledge, the communities behind most popular proof assistants have invested a large amount of time and energy into building their systems. Consequently, formalization within one such system pays off mostly at large scales, calling for a community effort to extend the corresponding formal libraries. In practice, however, the availability of many different and mutually incompatible systems – each with their own advantages and disadvantages – has instead led to ever increasing specialization within the formal mathematics community, resulting in many different libraries with considerable (and, from a heterogeneous point of view, unnecessary) overlap of actual content.
A more extensive summary of the state of the art and the scientific context of this work can be found in [KR16b].
2.1 Foundations
Overview
The notion of a foundation goes back to the foundational crisis of mathematics at the beginning of the 20th century. Resulting from a general confusion regarding the ontological status of informally defined objects like infinitesimals in real analysis, non-euclidean geometries and sets as objects of study in their own right, as well as debates over the validity of certain non-constructive proof techniques of ever growing abstraction and intricacy, the idea developed to find a formal, logical basis that fixes both an ontology as well as an unambiguous notion of what constitutes a valid proof.
By now, it is generally accepted within the mathematical community that the answer to this problem is the combination of first-order logic and some system of set theory, usually (an extension of) Zermelo-Fraenkel set theory with the axiom of choice (ZFC). However, alternative foundations have been around at least since the Principia Mathematica by Russell and Whitehead, which can be seen as an early variant of type theory.
There is an inherent trade-off in every such foundation between complexity and expressiveness. On the one hand, a foundation should be simple, with very few primitive notions, to make reasoning about the foundation more convenient and to establish trust in its consistency. On the other hand, since a foundation should ultimately establish a framework for all of mathematics, it should be as expressive as possible, to allow mathematicians to talk about all of the desired objects and realms in terms of the foundation. This trade-off naturally led to a large diversity of foundations, which nowadays are used in different proof assistants.
All of these systems fix one specific foundation as a basis for their specification language, usually a variant of either constructive type theory, higher-order logic or (as implicitly used in mathematics) first-order set theory. The constructive type theories are mostly based on Martin-Löf type theory [ML74] or the calculus of constructions [CH88] and make use of the Curry-Howard correspondence [CF58; How80] to treat propositions as types (and proofs as λ-terms). Systems include Nuprl [Con+86], Agda [Nor05], Coq [Tea03], and Matita [Asp+06a]. The second group of systems goes back to Church’s higher-order logic [Chu40] and includes HOL4 [HOL4], ProofPower [Art], Isabelle/HOL [NPW02], and HOL Light [Har96].
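The Curry-Howard correspondence can be made concrete in Lean, where a proposition is a type and a proof of an implication is literally a λ-term, i.e. a function:

```lean
-- A proof of A ∧ B → B ∧ A is a function swapping the two components
-- of the conjunction (a pair) it receives as input.
example {A B : Prop} : A ∧ B → B ∧ A :=
  fun h => ⟨h.2, h.1⟩
```

Type-checking the term against the stated proposition is exactly what it means to verify the proof.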
Since type theories and higher-order logics are more conveniently machine-implementable, systems using set theories are noticeably rarer; examples are Mizar [Miz], Isabelle/ZF [PC93], and Metamath [MeMa].
The foundation of the PVS system (see Section 10.1) includes a variant of higher-order logic, but has been specifically designed to have a set theoretic semantics. The IMPS system [FGT93] is based on a variant of higher-order logic with partial functions. The foundation of ACL2 [KMM00] is an untyped language based on Lisp.
Heterogeneous Reasoning
As mentioned in the previous section, even though all of mathematics is assumed to be reducible to some foundation, the way mathematics is usually practiced is according to the heterogeneous method, in which all foundational aspects are “hidden” and left implicit, unless necessary in the respective context. One major advantage of this method, in which theories are used to introduce new primitive notions, is that (in connection with the little theories approach) it allows for reusing mathematical results and moving them along theory morphisms (i.e. truth-preserving maps) between theories. This approach has been applied successfully in software engineering and algebraic specification, where formal module systems are used to build large theories out of little ones, e.g., in SML [Mil+97] and ASL [SW83].
To accommodate this more convenient style of reasoning, most formal systems have specific features that introduce some form of heterogeneity, either explicitly or implicitly. Explicit features include e.g. “locales” in Isabelle, “parametric theories” in PVS, “modules” in Coq, or “structures” in Mizar. The IMPS system [FGT93] is somewhat unique insofar as it was designed specifically with heterogeneous reasoning in mind.
To use the heterogeneous method implicitly, the user introduces a new formal construct (such as real numbers) by defining them in terms of existing ones (e.g. the usual construction via equivalence classes on Cauchy sequences) and proving the actually relevant (i.e. construction independent) theorems about them (e.g. field axioms, topological completeness). Finally, the user can rely on the proven properties alone, without ever needing to expand the concrete definition originally implemented, which in the process is rendered irrelevant for all practical purposes.
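This workflow can be caricatured in a minimal Python sketch (all names here are illustrative and not drawn from any actual prover library): client code depends only on the abstract interface, i.e. the proven properties, so the concrete construction becomes irrelevant:

```python
from abc import ABC, abstractmethod
from fractions import Fraction

class OrderedField(ABC):
    """The construction-independent interface: only the proven properties."""
    @abstractmethod
    def add(self, other: "OrderedField") -> "OrderedField": ...
    @abstractmethod
    def le(self, other: "OrderedField") -> bool: ...

class CauchyStyleReal(OrderedField):
    """One concrete construction (here simply backed by rationals);
    a Cauchy-sequence or Dedekind-cut construction would be others."""
    def __init__(self, value) -> None:
        self._v = Fraction(value)
    def add(self, other: "CauchyStyleReal") -> "CauchyStyleReal":
        return CauchyStyleReal(self._v + other._v)
    def le(self, other: "CauchyStyleReal") -> bool:
        return self._v <= other._v

def add_is_monotone(a: OrderedField, b: OrderedField, c: OrderedField) -> bool:
    # Client code uses only the interface: if a <= b then a + c <= b + c.
    return (not a.le(b)) or a.add(c).le(b.add(c))

print(add_is_monotone(CauchyStyleReal(1), CauchyStyleReal(2), CauchyStyleReal(3)))
# prints: True
```

Swapping in a different construction behind the same interface leaves all client code (here `add_is_monotone`) untouched, which is exactly what makes the originally implemented definition irrelevant in practice.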
To enable implicit heterogeneity, the system has to provide corresponding definition principles. Examples include type definitions in the HOL systems, provably terminating functions in Coq or Isabelle/HOL, or provably well-defined indirect definitions in Mizar. However, it should be noted that such implicit heterogeneous definitions are usually internally expanded into conservative extensions of the foundation. Alternatively, several data types provide an additional way to use heterogeneity implicitly. These include (co)inductive types and record types, e.g. as used by Mizar structures and by Coq records in the Mathematical Components project [MC]. This has the additional advantage that it allows for computation within the foundation.
The Incompatibility Problem
The homogeneous method (in combination with implicit heterogeneity) has an obvious advantage: Since the foundation is fixed and can not be extended by new axioms, the trusted code base remains fixed as well. Furthermore, it allows integrating computational methods, e.g. to reason about equality, without having to tediously instantiate and apply the corresponding axioms. However, it has the major disadvantage of making reuse of formalized knowledge difficult, if not impossible. Since the techniques for implicit heterogeneity are elaborated on the basis of the foundation, the actual heterogeneous structure of some fragment of implemented knowledge is difficult to identify. Furthermore, since different systems offer different techniques to introduce implicit heterogeneity, the construction method used in one system can often not be easily translated into another system. The elaborated implementations themselves are again framed in terms of the foundation (or an intrinsic part thereof), and are consequently useless for a system based on a differing foundation.
Even though a fixed foundation is therefore reasonable for an individual formal system, if we envision a universal library of mathematics – a major goal of formal mathematics going back at least to the QED project and manifesto [Qed] of 1994 – this is the wrong approach. Furthermore, different areas of mathematics favor different foundations (such as category theory, specifically in algebra), as do different communities dealing with formalizing mathematics, whether due to technical reasons or simply familiarity. Thus, formal libraries would profit from foundational pluralism, i.e., the ability to support multiple foundations in a single universal library.
2.2 Formal Libraries
Overview
Usually, implemented formal systems have some notion of a library, a collection of formalizations (usually just the corresponding source files) that can be handled in specific ways – imported and used when creating new content, collectively exported, etc. Most systems are distributed with some base library providing the most useful and ubiquitous settings of interest, such as Booleans or number spaces, and often maintain additional, community-created libraries with more advanced contents.
The Isabelle and Mizar groups each maintain one centralized library – the “Archive of Formal Proofs” [AFP] and the “Mizar Mathematical Library” [MizLib], respectively. The Coq group relies on a package manager on top of distributed Git repositories instead of a central archive. These libraries contain individual formalizations with relatively few interdependencies. Other libraries are created and maintained by communities separate from the developers of the system; these are often valuable in their own right, such as Tom Hales’s formalizations in HOL Light for the Kepler conjecture [Hal+15a] and Georges Gonthier’s work in Coq on the recently proved Feit-Thompson theorem [Gon+13]. John Harrison’s formalizations in his HOL Light system [Har96] and the NASA PVS library [PVS] have a similar flavor, although they were motivated not by a single theorem but by a specific application domain. The latter is one of the biggest decentralized libraries, whose maintenance is disconnected from that of the system.
As mentioned, most of these systems are based on the homogeneous method. However, there are some libraries that are intrinsically heterogeneous, such as the IMPS library [FGT], the LATIN logic library [Cod+11] developed at KWARC (see Section 3.1) and the TPTP library [Sut09] of challenge problems for automated theorem provers. Unfortunately, none of these enjoy the level of interpretation, deduction, and computation support developed for individual fixed foundations.
The OpenTheory format [Hur09] offers some support for heterogeneity in order to allow moving theorems between systems for higher-order logic (specifically HOL Light, HOL4, and ProofPower). It provides a generic representation format for proofs within higher-order logic that makes the dependency relation (i.e., the operators and theorems used by a theorem) explicit. The OpenTheory library comprises several theories that have been obtained by manually refactoring exports from HOL systems.
Library Integration
There are two problems concerning library integration. The first is, given a single library, refactoring its contents to increase modularity. This results in a “more heterogeneous” set of theories that are easier to reuse, and expands the corresponding theory graph so that shared or inherited theories and results become more visible and explicit. An attempt at giving a formal calculus for theory refactoring has recently been published [AH15].
The second, and for this thesis more relevant, problem is to integrate two or more given libraries with each other, so that knowledge formalized in one of them can be reused in, translated to or identified with contents of the others. In the best case, two libraries might even be merged into a single library with ideally no redundant content.
No strong tool support is available for any of these facets. The state-of-the-art for refactoring a single library is manual ad hoc work by experts, maybe supported by simple search tools (often text-based). Also, the widespread use of the homogeneous method makes integrating and merging libraries in different systems extremely difficult, since usually basic concepts in one foundation cannot be directly translated to corresponding concepts in the other [KRSC11].
This is despite the large need for more integrated and easily reusable large libraries. For example, in Tom Hales’s Flyspeck project [Hal+15a], his proof of the Kepler conjecture is formalized in HOL Light. But it relies on results achieved using Isabelle’s reflection mechanism, which cannot be easily recreated in HOL Light. And this is an integration problem between two tools based on the same root logic.
Library Translations
There are two ways to take on library integration. Firstly, one can try to translate the contents of one formal system directly into another system; I will call this a library bridge. This requires intimate knowledge of both systems and their respective foundations, and is necessarily somewhat ad hoc. A small number of library bridges have been realized, typically in special situations. [KW10] translates from HOL Light [Har96] to Coq [Tea03], and [OS06a] to Isabelle/HOL. Both translations benefit from the well-developed HOL Light export and the simplicity of the HOL Light foundation. [KS10] translates from Isabelle/HOL [NPW02] to Isabelle/ZF [PC93]. Here import and export are aided by the use of a logical framework to represent the logics. The Coq library has been imported into Matita [Asp+06a] once, aided by the fact that both use very similar foundations. The OpenTheory format [Hur09] facilitates sharing between HOL-based systems but has not been used extensively.
The second way is to use a more general logical framework (see Section 2.3) which provides some way to specify the respective foundations, and integrate the libraries under consideration directly into that framework. Then the framework can serve as a uniform intermediate data structure, via which other systems can import the integrated libraries. This approach will be extensively described in this thesis, using the logical framework LF [HHP93a] and making the libraries available to knowledge management services. Another example is the Dedukti system [BCH12], which imports, e.g., Coq and HOL Light into a similar logical framework, namely LF extended with rewriting.
Dedukti underlies the recent Logipedia project [DT], aiming to build an encyclopedia of formal proofs in simple type theory (originally developed in Matita), which can be exported to other systems.
Again, the prevalence of the homogeneous method constitutes a major problem here. Even with implicit heterogeneity, the fact that e.g. theories such as the real numbers are still modeled as conservative extensions of a fixed foundation using some intricate construction principle (Cauchy sequences, Dedekind cuts) means that using different definitions can make it impossible to align a theory in one library to the corresponding theory in a different library, even though from a mathematical point of view they are “the same” (i.e. isomorphic). And even if the same abstract construction principle is used in two libraries, their implementations in terms of the underlying foundation can be different enough to make it difficult to identify the two resulting theories.
Very little work exists to address this problem. In [OS06a], some support for library integration was present: Defined identifiers could be mapped to arbitrary identifiers ignoring their definition. No semantic analysis was needed because the translated proofs were rechecked by the importing system anyway. This approach was revisited and improved in [KK13a], which systematically aligned the concepts of the basic HOL Light library with their Isabelle/HOL counterparts and proved the equivalence in Isabelle/HOL. The approach was further improved in [GK14a] by using machine learning to identify large sets of further alignments.
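At its core, such an alignment-based mapping sends identifiers of one library to their counterparts in another while ignoring the definitions behind them. A toy Python sketch of this idea (all identifiers are invented for illustration; real alignments, as in [KK13a], are curated by hand or found via machine learning as in [GK14a]):

```python
# Hypothetical alignments between HOL Light and Isabelle/HOL identifiers;
# the names are made up for illustration.
ALIGNMENTS = {
    "hollight/nat/SUC":  "isabelle/Nat/Suc",
    "hollight/real/add": "isabelle/Real/plus",
}

def translate(symbols):
    """Translate a term, given as a flat list of symbol names, along the
    alignment; unaligned symbols are kept as-is (and would have to be
    reconstructed or re-proved by the importing system)."""
    return [ALIGNMENTS.get(s, s) for s in symbols]

print(translate(["hollight/nat/SUC", "x"]))
# prints: ['isabelle/Nat/Suc', 'x']
```

The point of ignoring definitions is visible even in this caricature: the map relates names, not the (foundation-specific) constructions behind them, so no semantic analysis is required as long as the importing system rechecks the translated proofs.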
The OpenTheory format [Hur09] provides representational primitives that, while not explicitly using theories, effectively permit heterogeneous developments in HOL. The bottleneck here is manually refactoring the existing homogeneous libraries to make use of heterogeneity.
A partial solution aimed at overcoming the integration problem was sketched in [RKS11].
2.3 Logical Frameworks
Over the last 20 years, formal systems have added an additional meta level through the introduction of logical frameworks. These provide tools to specify logical systems themselves in a formal way. An overview of the current state of the art is given by [Pfe01].
An example for one such framework is LF [HHP93a] (see Section 4.3). Logical frameworks introduce the possibility to additionally reason about logics, as e.g. in Twelf [PS99], which is the currently most mature implementation of LF. Twelf has been used as a basis for the LATIN library (see Section 3.1). Also, since in logical frameworks logics are themselves represented as theories, they allow for defining logical systems heterogeneously, by building them up in a modular way.
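As a sketch of how a logic becomes a theory in such a framework, a fragment of an LF signature for minimal propositional logic (using the standard judgments-as-types encoding; the constant names are conventional, not taken from the LATIN sources) could read:

```latex
\begin{align*}
  o             &: \mathsf{type}       && \text{(the type of propositions)}\\
  \mathsf{ded}  &: o \to \mathsf{type} && \text{(the truth judgment: } \mathsf{ded}\,A \text{ is the type of proofs of } A\text{)}\\
  \supset       &: o \to o \to o       && \text{(implication)}\\
  \mathsf{impI} &: \Pi_{A,B:o}\; (\mathsf{ded}\,A \to \mathsf{ded}\,B) \to \mathsf{ded}\,(A \supset B)\\
  \mathsf{impE} &: \Pi_{A,B:o}\; \mathsf{ded}\,(A \supset B) \to \mathsf{ded}\,A \to \mathsf{ded}\,B
\end{align*}
```

Since logics are themselves theories in this sense, richer logics can be built modularly on top of such fragments, and logic translations become theory morphisms.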
Dedukti [BCH12] implements LF modulo rewriting. By supplying rewrite rules (whose confluence Dedukti assumes) in addition to an LF theory, users can give more elegant logic encodings. Moreover, rewriting can be used to integrate computation into the logical framework. A number of logic libraries have been exported to Dedukti, which is envisioned as a universal proof checker. Isabelle [Isa] implements intuitionistic higher-order logic, which (if seen as a pure type system with propositions-as-types) is rather similar to LF. Despite being logic-independent, most of the proof support in Isabelle is optimized for individual logics defined in Isabelle, most importantly Isabelle/HOL and Isabelle/ZF.
λProlog – in its most mature implementation ELPI [Dun+15] – extends the logic programming paradigm with higher-order functionality. As such, it allows higher-order abstract syntax approaches and can be (and is) used as a logical framework as well.
Unfortunately, logical frameworks are not an efficient alternative to the prevailing homogeneous formal systems. State-of-the-art proof assistants rely heavily on the specific peculiarities of their underlying foundation to provide efficient proof search techniques, which would be impossible to implement at the high level of generality that logical frameworks provide, at least without introducing considerable overhead and thus reducing efficiency.
Another problem is that specific concepts used by some logic (notably record types, subtyping principles, inductive definitions, etc.) may be difficult to realize in a logical framework in such a way that type checking can be done effectively, since the necessary information for doing so cannot always easily be lifted to the rather general level on which the type checker operates (see Part II).
Chapter 3
Context and Contribution
This thesis is part of two ongoing research projects: OAF (Open Archive of Formalizations) and ODK (OpenDreamKit). The goals of OAF in particular largely coincide with the objectives of this thesis, specifically regarding applications for theorem prover libraries; ODK entails work packages extending the same goals to other formal systems, such as computer algebra systems and mathematical databases. Our approach to realizing these projects largely makes use of the Math-in-the-Middle architecture (MitM).
Before we describe the precise objectives and contribution of this work, we will go over the above and related projects, so as to establish the necessary context.
3.1 LATIN
The LATIN project [Cod+11] was a DFG-funded project running from 2009 to 2012 under the principal investigators Michael Kohlhase, Florian Rabe and Till Mossakowski. Its aim was to build a heterogeneous, highly integrated library of formalizations of logics and related languages as well as translations between them. It uses Mmt (see Section 4.1) as a framework, with the logical framework LF (see Section 4.3) as a meta-theory for the individual logics.
True to the general Mmt philosophy, all the integrated theories are built up in a modular way and include propositional, first-order, sorted first-order, common, higher-order, modal, description, and linear logics. Type-theoretical features, which can be freely combined with logical features, include the λ-cube, product and union types, as well as base types like Booleans or natural numbers. In many cases alternative formalizations are given (and related to each other), e.g., Curry- and Church-style typing, or Andrews- and Prawitz-style higher-order logic. The logic morphisms include the relativization translations from modal, description, and sorted first-order logic to unsorted first-order logic, the negative translation from classical to intuitionistic logic, and the translations from first-order to sorted first-order and higher-order logic.
The left side of Figure 3.1 shows a fragment of the LATIN atlas, focusing on first-order logic (FOL) being built on top of propositional logic (PL), its translation to HOL and ultimately resulting in the foundations of Mizar, Isabelle/HOL and ZFC, as well as translations between them. The formalization of propositional logic includes its syntax as well as its proof and model theory, as shown on the right of Figure 3.1.
3.2 The OAF Project
The OAF project [OAF] is a DFG-funded project running from 2015 to 2019 under the principal investigators Michael Kohlhase and Florian Rabe. The goal was to provide an Open Archive of Formalizations: a universal archiving solution for formal libraries, corresponding library management services (such as distributing, browsing and searching library contents) and methods for integrating libraries in different formalisms in a unifying framework – namely OMDoc/Mmt (see Section 4.1), as in Figure 3.2 – to allow for sharing and translating content across them. We further want it to be scalable with respect to both the size of the knowledge base and the diversity of logical foundations.
Theoretically, the main prerequisite has been established in the LATIN project (see Section 3.1). However, whereas LATIN provides formalizations of basic logics, type theories and related systems, there still remains the problem of integrating the existing formal libraries of theorem prover systems. Consequently, there are two major objectives of the OAF project this thesis touches on:
Making existing libraries accessible to a unifying framework: We want to make existing theorem prover libraries accessible to the Mmt system.
Libraries that have been imported into Mmt in the course of this project include HOL Light [KR14], Mizar [Ian+13], TPTP [Sut09], IMPS [Bet18], Coq [MRS], Isabelle (as-of-yet unpublished) and (as presented in detail in Chapter 10) PVS.
Refactoring and integrating libraries: Once we have several libraries integrated into Mmt, we can implement generic knowledge management services for the integrated libraries, such as:
•
•
Naturally, both aspects are heavily interrelated, since integrating libraries is easier after some suitable refactoring, and already aligned libraries can potentially be analyzed more easily and further refactored.
3.3 OpenDreamKit
Disclaimer:
The following two sections have been previously published as part of [Deh+16] with coauthors Paul-Olivier Dehaye, Mihnea Iancu, Michael Kohlhase, Alexander Konovalov, Samuel Lelièvre, Markus Pfeiffer, Florian Rabe, Nicolas M. Thiéry and Tom Wiesing.
Neither the theoretical results nor the majority of the writing can be attributed to me personally. These chapters should hence not be considered my contribution; they are mainly included because they are the best description of the OpenDreamKit project and the Math-in-the-Middle approach, which in turn represents a prime concrete application for the results of this thesis.
The Math-in-the-Middle ontology for formal mathematics however was largely developed and curated by me, and can thus be considered my contribution.
Just as with interactive theorem provers specifically, in the last decades we have witnessed the emergence of a wide ecosystem of open-source tools supporting research in pure mathematics in general. This ranges from specialized to general-purpose computational tools such as GAP [Gro16], PARI/GP, LinBox, MPIR, Sage [Dev16], or Singular [SNG], via online databases like the LMFDB [LMF] and online services like Wikipedia and the arXiv [Arx], to webpages like MathOverflow. A great opportunity is the rapid emergence of key technologies, in particular the Jupyter [Jup] (previously IPython) platform for interactive and exploratory computing, which targets all areas of science.
This has proven the viability and power of collaborative open-source development models, by users and for users, even for delivering general purpose systems targeting large audiences such as researchers, teachers, engineers, amateurs, and others. Yet some critical long term investments, in particular on the technical side, are in order to boost the productivity and lower the entry barrier:
• Streamlining access, distribution, portability on a wide range of platforms, including High Performance Computers or cloud services.
• Improving user interfaces, in particular in the promising area of collaborative workspaces such as those provided by CoCalc [CC] (previously called SageMathCloud).
• Lowering barriers between research communities and promoting dissemination, for example by making it easy for a specialist of scientific computing to use tools from pure mathematics, and vice versa.
• Bringing together the developer communities to promote tighter collaboration and symbiosis, accelerate joint development, and share best practices.
• Structuring the development to outsource as much of it as possible to larger communities, and focusing manpower on core specialities: the implementation of mathematical algorithms and databases.
• And last but not least: promoting collaborations at all scales to further improve the productivity of researchers in pure mathematics and applications.

OpenDreamKit – “Open Digital Research Environment Toolkit for the Advancement of Mathematics” [ODK] – is a project funded under the European H2020 Infrastructure call [EI] on Virtual Research Environments, to work on many of these problems.
In practice, OpenDreamKit’s work plan consists of several work packages: component architecture (modularity, packaging, distribution, deployment), user interfaces (Jupyter interactive notebook interfaces, 3D visualization, documentation tools), high performance mathematical computing (especially on multicore/parallel architectures), a study of social aspects of collaborative software development, and a package on data/knowledge/software-bases.
The latter package focuses on the identification and extension of ontologies and standards to facilitate safe and efficient storage, reuse, interoperation and sharing of rich mathematical data, whilst taking provenance and citability into account. This package is the most relevant regarding this thesis.
Its outcome will be a component architecture for semantically sound data archival and sharing, and integrate computational software and databases. The aim is to enable researchers to seamlessly manipulate mathematical objects across computational engines (e.g. switch algorithm implementations from one computer algebra system to another), front end interaction modes (database queries, notebooks, web, etc) and even backends (e.g. distributed vs. local).
3.4 The Math-in-the-Middle Approach and Library
The Math-in-the-Middle architecture is used heavily in the OpenDreamKit project. It centers around the Math-in-the-Middle ontology, an Mmt archive of formalized math using all of the features introduced in Part II. In particular, this archive provides useful real-world examples for these features. The ontology serves as a mediator between individual systems involved in the OpenDreamKit project to facilitate knowledge exchange and remote procedure calls.
Additionally, the Math-in-the-Middle architecture can be used in other contexts. In particular, it facilitates across-library knowledge management services for formal libraries, as described in Part IV. From this perspective, while this section focuses on the role of the architecture in the OpenDreamKit project, the general ideas underlying this section can be generalized beyond the systems involved in OpenDreamKit.
Since we aim to make our components interoperable at a mathematical level, we have to establish a common meaning space that will allow us to share computation, visualization of the mathematical concepts, objects, and models between the respective systems. This mediation problem is well understood in information systems [Wie92], and has for instance been applied to natural language translation via a hub language [KW03]. Here, our hub is mathematics itself, and the vocabulary (or even language) admits further formalization that translates into direct gains in interoperability. For this reason, neither OpenMath [Bus+04] nor MathML [Aus+03] have the practical expressivity needed for our intended applications.
3.4.1 A Common Meaning Space for Interoperability
One problem is that the software systems in OpenDreamKit cover different mathematical concepts, and if there are overlaps, their models for them differ, and the implementing objects have different functionalities. This starts with simple naming issues (e.g. elliptic curves are named ec in the LMFDB, and as EllipticCurve in Sage), persists through the underlying data structures and in differing representations in the various tables of the LMFDB), and becomes virulent at the level of algorithms, their parameters, and domains of applicability.
To obtain a common meaning space for a VRE, we have the three well-known approaches in Figure 3.3.
The first does not scale to a project with about a dozen systems, and for the third there is no obvious contender in the OpenDreamKit ecosystem. Fortunately, we already have a “standard” for expressing the meaning of mathematical concepts – mathematical vernacular: the language of mathematical communication; indeed, all the concepts supported in the OpenDreamKit VRE are documented in mathematical vernacular in journal articles, manuals, etc. The obvious problem is that mathematical vernacular is i) too ambiguous: we need a human to understand structure, words, and symbols, and ii) too redundant: every paper introduces slightly different notions.
Therefore we explore an approach where we flexiformalize (i.e. partially formalize; see [Koh13a]) mathematical vernacular to obtain a flexiformal ontology of mathematics that can serve as an open communication vocabulary. We call the approach the Math-in-the-Middle (MitM) Strategy for integration and the ontology the MitM ontology.
The descriptions in the MitM ontology must simultaneously be system-near to make interfacing easy for systems, and serve as an interoperability standard – i.e. be general and stable. If we have an ontology system that allows modular/structured ontologies, we can solve this apparent dilemma by introducing interface theories [KRSC11], i.e. ontology modules (the light purple circles in Figure 3.4) that are at the same time system-specific in their description of mathematical concepts – near the actual representation of the system – and part of the greater MitM ontology (depicted by the cloud in Figure 3.4) as they are connected to the core MitM ontology (the blue circle) by views we call interface views. The MitM approach stipulates that interface theories and interface views are maintained and released together with the respective systems, whereas the core MitM ontology represents the mathematical scope of the VRE and is maintained with it. In fact in many ways, the core MitM ontology is the conceptual essence of the mathematical VRE.
Apart from OpenDreamKit, the MitM ontology is furthermore used in the FrameIT [RKM16] approach for developing serious games for math education, and the MaMoRed project ([Koh+17b],[Kop+18]) with the goal of building a modular library of computational models in physics and related STEM fields.
3.5 Objectives
The OAF and OpenDreamKit projects have a common aim, differing only in their domains of applicability – or aspects of the tetrapod (see Chapter 1): The integration of (libraries of) formal systems (inference) in the former, and mathematical software and databases (computation/tabulation) in the latter. The goal of this thesis is to outline an approach to solve this integration problem and to demonstrate its feasibility. This entails the following objectives:
O1 Integrating the libraries and ontologies of existing formal systems into the Mmt system.
O2 Developing a modular logical framework that extends LF by the features needed to adequately represent the foundations of practical systems.
O3 Implementing generic, foundation-independent knowledge management services on top of the integrated libraries.
3.6 Contribution and Overview
In Part II, I present a modular logical framework (O2) that extends LF by various features and is used in the remainder of this work. In particular, I explain in detail how we can use Mmt to develop such a framework and extend it by virtually arbitrary additional features via various means. Where issues arise (e.g. with advanced subtyping principles, see Chapter 7), I discuss these and suggest solutions.
Notably, unlike other fixed-foundation frameworks, we can do so without needing to express our principles in terms of a more primitive logic. Instead, we can use the conveniences of an extensive API providing all of the necessary and most common abstractions in a modern, production-ready programming language (Scala) and all the infrastructure that comes with it – such as modern IDEs with static type checking and debugging functionality, extensive library support, etc.
In Part III I demonstrate the OAF (and partially ODK) approach to integrating libraries and ontologies into the Mmt system (O1), using the theorem prover system PVS (Chapter 10), the computer algebra systems GAP and Sage (Chapter 12), and the database LMFDB (Chapter 11) as representative examples. The formalization of the foundational logic of PVS in Section 10.2 in particular requires and makes use of the logical framework presented in Part II. Since that logical framework is easily extendable – and since PVS can be considered particularly challenging as evidenced by the fact that no earlier representations of the PVS foundation exist in other frameworks – the approach likely scales to virtually arbitrary systems and their foundations.
As a result, we have, for the first time, all symbols from both the underlying foundations and ontologies as well as the associated libraries of formal systems accessible from within the same framework. In particular, we preserve the original presentation and semantics without the need to build dedicated (and sometimes impossible or unsound) translations between foundations.
Finally, Part IV covers knowledge management services (O3) that can now be implemented generically and foundation-independently, and which are consequently available for any (pair of) systems integrated into Mmt now or in the future. In particular, I present:
•
•
•
•
The last four points particularly achieve the main goals of the OAF project, and ultimately enable the intended results of (our part of) the OpenDreamKit project. The translation method in particular allows for transferring virtually arbitrary formal content between libraries and foundations, thus facilitating e.g. remote procedure calls between systems – in other words, it allows for flexibly and fully integrating formal systems and their libraries.
As such, this thesis demonstrates the feasibility of our approach to library integration in enabling proper knowledge management beyond individual systems and libraries within a unifying framework and system, bringing us one step closer to a full tetrapodal system.
Consequently this work covers a broad spectrum, ranging from purely theoretical developments down to implementation details.
Much of the content of this thesis has been published in previous papers. These are listed separately in the bibliography (Section 18).
Chapter 4
Preliminaries
Throughout this thesis we will use the Mmt language and system as a meta-logical framework. Mmt is a highly flexible system and API using the foundation-independent OMDoc language for its backend. Within Mmt, we can implement almost arbitrary logical frameworks; most prominently, it comes with an implementation of LF which we extend and use throughout this thesis.
Disclaimer:
The text of this chapter has been assembled and adapted from various previously published papers by (primarily) Florian Rabe, Michael Kohlhase and myself (all of which are referenced in this thesis), since these have emerged as the most suitable, concise introductions to the concepts described herein. The exposition has been thoroughly reworked for a uniform presentation, but some text fragments of the originals might remain. Original publications are cited where adequate.
4.1 OMDoc/Mmt
OMDoc [Koh13b] is a representation language developed by Michael Kohlhase that extends OpenMath and MathML and provides general definitions of the syntax of both mathematical objects and mathematical documents. It makes use of content dictionaries, which introduce primitive notions and their semantics. In particular, it can represent exports of formal system libraries and their documentation.
In the last ten years, (chiefly) Florian Rabe redeveloped the fragment of OMDoc pertaining to formal knowledge resulting in the OMDoc/Mmt language [RK13a; HKR12a; Rab17b]. OMDoc/Mmt greatly extends the expressivity, clarifies the representational primitives, and formally defines the semantics of this OMDoc fragment. It is designed to be foundation-independent and introduces several concepts to maximize modularity and to abstract from and mediate between different foundations, to reuse concepts, tools, and formalizations.
More concretely, the OMDoc/Mmt language integrates successful representational paradigms:
• the logics-as-theories representation from logical frameworks,
• theories and the reuse along theory morphisms from the heterogeneous method,
• the Curry-Howard correspondence from type theoretical foundations,
• URIs as globally unique logical identifiers from OpenMath,
• the standardized XML-based interchange syntax of OMDoc,
and makes them available in a single, coherent representational system for the first time. The combination of these features is based on a small set of carefully chosen, orthogonal primitives in order to obtain a simple and extensible language design.
OMDoc/Mmt offers very few primitives, which have turned out to be sufficient for most practical settings (a more detailed grammar is given in Section 4.1.1). These are:
1. theories,
2. theory morphisms (most importantly views and includes),
3. constant declarations, and
4. objects.
Using these primitives, logical frameworks, logics and theories within some logic are all uniformly represented as Mmt theories, rendering all of those equally accessible, reusable and extendable. Constants, functions, symbols, theorems, axioms, proof rules etc. are all represented as constant declarations, and all terms which are built up from those are represented as objects.
Theory morphisms represent truth-preserving maps between theories. Examples include theory inclusions, translations/isomorphisms between (sub)theories and models/instantiations (by mapping axioms to theorems that hold within a model), as well as a particular theory inclusion called meta-theory, that relates a theory on some meta level to a theory on a higher level on which it depends. This includes the relation between some low level theory (such as the theory of groups) to its underlying foundation (such as first-order logic), and the latter’s relation to the logical framework used to define it (e.g. LF).
All of this naturally gives us the notion of a theory graph, which relates theories (represented as nodes) via edges representing theory morphisms (as in Figure 4.1), and which sits right at the design core of the OMDoc/Mmt language.
Given this modular approach of OMDoc/Mmt, heterogeneity is made explicit in the sense that, even though foundations are present via meta-theories, only those aspects of the foundation that are used in the definition of a theory are present in the theory itself, making it easy to abstract from the foundation and reuse the theory.
The OMDoc/Mmt language is used by the Mmt system [RK13a], which provides a powerful API to work with documents and libraries in the OMDoc/Mmt language, including a terminal to execute Mmt specific commands, a web server to display information about Mmt libraries (such as their theory graphs) and plugins for the text editors/IDEs jEdit and IntelliJ IDEA, that can be used to create, type check and compile documents into the OMDoc/Mmt language. The API is heavily customizable via plugins to e.g. add foundation specific type checking rules and import and translate documents from different formal systems.
All of this puts Mmt on a new meta level, which can be seen as the next step in a progression towards more abstract formalisms as indicated in the table below. In conventional mathematics (first column), domain knowledge is expressed directly in ad hoc notation. Logic (second column) provided a formal syntax and semantics for this notation. Logical frameworks (third column) provided a formal meta-logic in which to define this syntax and semantics. Now Mmt (fourth column) adds a meta-meta-level, at which we can design even the logical frameworks flexibly. (This meta-meta-level gives rise to the name Mmt with the last letter representing both the underlying theory and the practical tool.) That makes Mmt very robust against future language developments: We can, e.g., develop LF +X without any change to the Mmt infrastructure and can easily migrate all results obtained within LF.
| Mathematics      | Logic            | Meta-Logic        | Foundation-Independence |
|                  |                  |                   | Mmt                     |
|                  |                  | logical framework | logical framework       |
|                  | logic            | logic             | logic                   |
| domain knowledge | domain knowledge | domain knowledge  | domain knowledge        |
4.1.1 Mmt Syntax
Intuitively, OMDoc/Mmt is a declarative language for theories and views over an arbitrary object language. For the purposes of this thesis, we will work with the (only slightly simplified) grammar given in Figure 4.2.
In the simplest case, theories are lists of constant declarations 𝑐[∶ 𝑇][= 𝑡], where 𝑇 and 𝑡 are expressions that may use the previously declared constants. Naturally, these expressions must be subject to some type system (in which Mmt is also parametric), for the purposes of this thesis primarily the logical framework LF (see Section 4.3) and extensions thereof.
Mmt achieves language-independence through the use of meta-theories: every Mmt-theory may designate a previously defined theory as its meta-theory. For example, when we represent the HOL Light library in Mmt, we first write a theory HOLLight for the logical primitives of HOL Light. Then each theory in the HOL Light library is represented as a theory with HOLLight as its meta-theory. In fact, we usually go one step further: HOLLight itself is a theory, whose meta-theory is a logical framework such as LF. That allows us to concisely define the syntax and inference system of HOL Light.
Correspondingly, a view 𝑣 ∶ 𝑆 → 𝑇 is a list of assignments 𝑐 ≔ 𝑒 of 𝑇-expressions 𝑒 to 𝑆-constants 𝑐. To be well-typed, 𝑣 must preserve typing, i.e., for an 𝑆-constant 𝑐 of type 𝐸 we must have ⊢𝑇 𝑒 ∶ 𝑣̄(𝐸). Here 𝑣̄ is the homomorphic extension of 𝑣, i.e., the map of 𝑆-expressions to 𝑇-expressions that substitutes every occurrence of an 𝑆-constant with the 𝑇-expression assigned by 𝑣.
We call 𝑣 simple if the expressions 𝑒 are always 𝑇-constants rather than complex expressions. The type-preservation condition for an assignment 𝑐 ≔ 𝑐′ then reduces to 𝑣̄(𝐸) = 𝐸′, where 𝐸 and 𝐸′ are the types of 𝑐 and 𝑐′. We call 𝑣 partial if it does not contain an assignment for every 𝑆-constant and total otherwise.
Importantly, we can then show generally at the Mmt-level that if 𝑣 is well-typed, then 𝑣̄ preserves all typing and equality judgments over 𝑆. In particular, if we represent proofs as typed terms (see Section 4.3), views preserve the theoremhood of propositions. This property makes views so valuable for structuring, refactoring, and integrating large corpora.
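To make the homomorphic extension concrete, the following is a minimal self-contained Scala sketch; the datatypes and names are illustrative stand-ins and deliberately much simpler than the actual Mmt API.

```scala
// Toy expression datatype and the homomorphic extension of a view.
// Illustrative only: the real Mmt API differs in structure and naming.
object ViewSketch {
  sealed trait Expr
  case class Const(name: String)            extends Expr // reference to a constant
  case class Vari(name: String)             extends Expr // bound variable
  case class App(f: Expr, args: List[Expr]) extends Expr // complex expression

  // A view is modeled as a map from S-constant names to T-expressions; its
  // homomorphic extension substitutes assigned constants and recurses into
  // subexpressions. For a partial view, unassigned constants stay untouched.
  def homExtend(view: Map[String, Expr])(e: Expr): Expr = e match {
    case Const(n)     => view.getOrElse(n, Const(n))
    case v: Vari      => v
    case App(f, args) => App(homExtend(view)(f), args.map(homExtend(view)))
  }
}
```

Note that the extension touches only constants: bound variables are left alone, which is exactly why the translated expression remains well-formed over the target theory.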
Syntactically:
•
•
•
Remark 4.1:
The module system is conservative: every theory can be elaborated (flattened) into one that only declares constants, by recursively replacing every include statement by the list of constants declared in the included theory. We will occasionally reference the result of flattening a theory.
Similarly, we can eliminate defined constants by replacing every occurrence of the constant symbol by its definition (definition expansion).
We allow definitions in variable contexts for purely technical reasons, most notably in the context of record types (see Section 8.2). As a result, there is a notable similarity between variable contexts and theories; in fact, we can think of flattened theories as named variable contexts.
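As a minimal illustration of definition expansion, consider the following Scala sketch over a toy expression datatype (names and types are illustrative, not the Mmt API):

```scala
// Definition expansion: replace every occurrence of a defined constant by its
// definiens, recursively, until no defined constants remain. Assumes the
// definitions are acyclic, as guaranteed for well-formed theories.
object ExpandSketch {
  sealed trait Expr
  case class Const(name: String)            extends Expr
  case class App(f: Expr, args: List[Expr]) extends Expr

  def expand(defs: Map[String, Expr])(e: Expr): Expr = e match {
    case Const(n) if defs.contains(n) =>
      expand(defs)(defs(n)) // the definiens may itself use defined constants
    case c: Const     => c
    case App(f, args) => App(expand(defs)(f), args.map(expand(defs)))
  }
}
```

For example, expanding a constant `one` defined as `succ(zero)` yields the fully unfolded term containing only the undefined constants `succ` and `zero`.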
It remains to define the exact syntax of expressions. In the grammar in Figure 4.2, 𝑐 refers to constants and 𝑥 refers to bound variables.
Simple expressions are either references to constants 𝑐 (of the meta-theory, an included theory or previously declared in the current theory) or to bound variables 𝑥. Complex expressions are of the form 𝑜[𝑥₁ ∶ 𝑇₁, …, 𝑥ₘ ∶ 𝑇ₘ](𝑎₁, …, 𝑎ₙ), where
• 𝑜 is the operator that forms the complex expression,
• each 𝑥ᵢ ∶ 𝑇ᵢ declares a variable of type 𝑇ᵢ that is bound by 𝑜 in subsequent variable declarations and in the arguments,
• each 𝑎ⱼ is an argument of 𝑜.
The bound variable context may be empty, and we write 𝑜(𝑎₁, …, 𝑎ₙ) instead of 𝑜[](𝑎₁, …, 𝑎ₙ).
To refer to constants (and modules), OMDoc/Mmt employs globally unique URIs, which are composed of a namespace, the name of the (containing) module and the name of a constant, separated by question marks.
Hence, Mmt URIs are triples of the form namespace?module?symbol.
The namespace part is a URI that serves as a globally unique root identifier of a corpus. It is not necessary (although often useful) for namespaces to also be URLs, i.e., references to a physical location. But even if they are URLs, we do not specify what resource dereferencing should return. Note that because Mmt URIs use ? as a separator, everything after the first ? is the query part of the URI, which makes it easy to implement dereferencing in practice.
The module and symbol parts of an Mmt URI are logically meaningful names defined in the corpus: The module is the container (i.e. a theory, view or advanced structural feature) and the symbol is a name inside the module (of a constant, include or more advanced feature). Both module and symbol name may consist of multiple /-separated segments to allow for nested modules and qualified symbol names.
Mmt URIs allow arbitrary Unicode characters. However, ? and /, which are used as delimiters, as well as any character not legal in URIs, must be escaped using %-encoding (per the RFC 3986/7 standards).
By using URIs, namespaces have the great advantage of being guaranteed to be globally unique. This comes at the price of being rather long. However, the CURIE standard can be used to introduce short prefixes.
For simplicity, in the remainder of this thesis we will rarely use complete HTTP links, but rather use single keyword abbreviations.
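The triple structure of Mmt URIs can be sketched with a few lines of Scala; the example URI below is hypothetical, and this is not the actual Mmt URI parser (which additionally handles %-escaping and multi-segment names):

```scala
// Split an Mmt-style URI namespace?module?symbol at its '?' delimiters.
// Sketch only: ignores %-escaped '?' characters and nested-name handling.
object MmtUriSketch {
  def parse(uri: String): (String, String, String) =
    uri.split('?') match {
      case Array(ns, mod, sym) => (ns, mod, sym)
      case _ =>
        throw new IllegalArgumentException(s"expected namespace?module?symbol: $uri")
    }
}
```

Since the namespace itself contains no unescaped ?, splitting at the delimiter always yields exactly the three components.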
Mmt offers one additional kind of theory morphism that we will occasionally use in this thesis. A structure from 𝑆 to 𝑇 is a theory morphism that behaves like an include in that it makes all constants of 𝑆 accessible to 𝑇. Unlike includes, however, structures
1. generate a fresh copy of the source theory's declarations rather than identifying them with existing ones, and
2. may carry assignments that rename or instantiate the copied constants.
Example 4.1:
Assume we have a theory Monoid (declaring a binary operation ∘ and a unit 𝑒), which is extended by a theory Group. We can specify a theory Ring by including two structures add : Group and mul : Monoid. Unlike with theory includes, the two resulting morphisms from Monoid to Ring (one directly, and one via Group) are not identified, giving us two distinct operations. Furthermore, add and mul can rename the operations to + and ⋅ respectively, and rename the units to 0 and 1.
4.1.2 MathHub
Mmt is used as a backend for the MathHub system [Ian+14; MH] – an online portal for mathematical documents. It is available at mathhub.info and hosts various Mmt repositories, interfaces for browsing and semantic search [KŞ06], as well as several additional applications, such as the semantic glossary for mathematics (SMGloM) [Gin+16].
Most of the specific OMDoc/Mmt formalizations in this thesis are hosted on MathHub's GitLab instance and linked accordingly.
4.2 The Mmt API
The Mmt system works on content represented in a fragment of OMDoc, a fact that is reflected by the internal datastructures corresponding to the syntax stated in Figure 4.2. All expression-level syntactic constructs are implemented as Scala case classes extending the class Term, and are named after the corresponding XML nodes of the underlying OMDoc.
Mmt URIs (see Section 4.1.1) are implemented as objects of the class Path, which are assembled from individual path components of type LocalName. The relevant subtypes are DPath for namespaces, MPath for module URIs of the form (d : DPath) ? (ln : LocalName), and GlobalName for declaration URIs of the form (m : MPath) ? (ln : LocalName).
Figure 4.3 gives a translation of the abstract syntax for expressions described above into the internal datastructures.
| Abstract Syntax      | Scala Syntax                                                |
| Symbol References    | OMID(S : ContentPath)                                       |
| Module References    | OMMOD(M : MPath) = OMID(M)                                  |
| Constant References  | OMS(C : GlobalName) = OMID(C)                               |
| Variable References  | OMV(V : LocalName)                                          |
| Applications         | OMA(f : Term, args : List[Term])                            |
| Binding Applications | OMBIND(f : Term, con : Context, arg : Term)                 |
| Context              | Context(vars : List[VarDecl])                               |
| Variable Declaration | VarDecl(v : LocalName, t : Option[Term], d : Option[Term])  |

Figure 4.3: Internal Scala Datastructures for Mmt Syntactical Constructors
For a detailed and up-to-date description of the classes provided by the Mmt API, I refer to the Mmt documentation.
4.2.1 Judgments
Mmt has a component called solver, parametrized by rules, that checks judgments and infers implicit arguments (solving placeholder variables). The solver uses a bidirectional type system, i.e., we have two separate judgments for type inference and type checking. To check a typing judgment we have two possibilities:
1.
2.
The solver starts with a top-down approach and defers to the bottom-up approach if necessary. Equality checks are used for solving variables and by using additional solution rules. This approach is described in [Rab17a] in detail and is largely irrelevant for the purposes of this thesis, hence details are omitted. The details on implementing typing rules in Mmt are treated by example in Chapter 5.
Similarly to typing, we have two equality judgments: one for checking equality of two given terms and one for reducing a term to another one (simplification).
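To illustrate the bidirectional style in general (without any claim of matching the actual Mmt solver implementation), here is a minimal self-contained checker for the simply-typed lambda calculus in Scala, with separate functions for the inference and checking judgments:

```scala
// Minimal bidirectional typechecker for the simply-typed lambda calculus,
// showing the interplay of type inference (bottom-up) and type checking
// (top-down). Purely illustrative; unrelated to the actual Mmt solver code.
object Bidir {
  sealed trait Ty
  case object Base extends Ty
  case class Arrow(from: Ty, to: Ty) extends Ty

  sealed trait Tm
  case class Var(name: String) extends Tm
  case class Lam(v: String, body: Tm) extends Tm // unannotated lambda: only checkable
  case class App(f: Tm, arg: Tm) extends Tm
  case class Ann(t: Tm, ty: Ty) extends Tm       // annotation switches modes

  type Ctx = Map[String, Ty]

  // bottom-up: infer the principal type of t, if any
  def infer(ctx: Ctx, t: Tm): Option[Ty] = t match {
    case Var(x)      => ctx.get(x)
    case Ann(t1, ty) => if (check(ctx, t1, ty)) Some(ty) else None
    case App(f, a)   =>
      infer(ctx, f) match {
        case Some(Arrow(dom, cod)) if check(ctx, a, dom) => Some(cod)
        case _                                           => None
      }
    case Lam(_, _)   => None // cannot infer a type for an unannotated lambda
  }

  // top-down: check t against an expected type; defaults to inference + comparison
  def check(ctx: Ctx, t: Tm, expected: Ty): Boolean = (t, expected) match {
    case (Lam(v, body), Arrow(dom, cod)) => check(ctx + (v -> dom), body, cod)
    case _                               => infer(ctx, t).contains(expected)
  }
}
```

The lambda case of `check` is the genuinely top-down rule; every other case falls back to inference and compares the result, mirroring how a bidirectional solver switches between the two approaches.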
Our judgments are given in Figure 4.4.
| Judgment           | Intuition                                                                  |
| ⊢ Γ Ctx            | Γ is a well-formed context                                                 |
| Γ ⊢ 𝑇 Inh          | 𝑇 is inhabitable (may occur on the right side of a typing judgment)        |
| Γ ⊢ 𝑇 univ         | 𝑇 is a universe (inhabitable, and every 𝑇′ with Γ ⊢ 𝑇′ ⇐ 𝑇 is inhabitable) |
| Γ ⊢ 𝑡 ⇐ 𝑇          | 𝑡 checks against inhabitable term 𝑇                                        |
| Γ ⊢ 𝑡 ⇒ 𝑇          | type/kind of term 𝑡 is inferred to be 𝑇                                    |
| Γ ⊢ 𝑡 ≡ 𝑡′ ∶ 𝑇     | 𝑡 and 𝑡′ are equal at type 𝑇                                               |
| Γ ⊢ 𝑡 ⇝ 𝑡′         | 𝑡 computes to 𝑡′                                                           |
| Γ ⊢ 𝐴 <∶ 𝐵         | 𝐴 is a subtype of 𝐵                                                        |

Figure 4.4: Judgments
For each of these rules, the solver assumes certain pre/postconditions to hold, which critically need to be preserved by the implemented rules. These are:
•
•
•
•
•
Remark 4.2:
It is sufficient (and desirable) to consider subtyping to be an abbreviation: Γ ⊢ 𝐴 <∶ 𝐵 iff for all 𝑡 we have that Γ ⊢ 𝑡 ⇐ 𝐴 implies Γ ⊢ 𝑡 ⇐ 𝐵. This allows for many subtyping rules to be derivable (usually) from typing rules.
Remark 4.3: Horizontal Subtyping and Equality
The equality judgment could alternatively be formulated as an untyped equality Γ ⊢ 𝑡 ≡ 𝑡′. That would require some technical changes to the rules presented in this thesis, but would usually not make a huge difference. In our case, however, the use of typed equality is critical for the records introduced in Chapter 8; see Remark 8.2.
The rules given in Figure 4.5 capture the intended semantics of the judgments above and can be assumed to hold for any type system implemented in Mmt.
The upper row contains the rules for contexts. Note that even though we allow the type 𝑇 of a variable to be omitted (which will be helpful for records later), this is only allowed if a definiens 𝑡 is present. (𝑇 must be inferable from 𝑡 or otherwise known from the environment.)
The second row contains the rules for looking up the type and definition of a variable.
The third row contains the bidirectionality rules, which algorithmically are the default rules that are applied when no type (resp. equality) checking rules are available: switch to type inference (resp. computation) and compare inferred and expected type (resp. the results).
The last row deals with universes and inhabitability.
Remark 4.4:
The inference rules as presented in this thesis always carry an algorithmic reading. To be precise, a rule of the form
can be implemented thusly: to prove that the judgment 𝐶 holds, check each of the premises 𝑃1 ... 𝑃𝑛 in order. Consequently, a premise should only contain variables or non-primitive symbols that 1) occur in the conclusion or 2) are introduced (either by type inference or simplification) in a previous premise. For example, the last rule in Figure 4.5 would still be adequate if we were to replace the premise Γ ⊢ 𝑇 ⇒ 𝑈 by the more general (and hence weaker) premise Γ ⊢ 𝑇 ⇐ 𝑈, making the inference rule itself stronger; however, it would be unclear how to implement such a rule efficiently, since there are potentially infinitely many candidates for 𝑈. The inference judgment Γ ⊢ 𝑇 ⇒ 𝑈 on the other hand unambiguously determines 𝑈 as the principal type of 𝑇 for the subsequent premise Γ ⊢ 𝑈 univ. Both judgments are needed to solve unknown variables, however.
An equality premise (such as 𝑇 ≡ ∏𝑥∶𝐴 𝐵) with as-of-yet unknown symbols (𝑥, 𝐴, 𝐵) can algorithmically be interpreted as a pattern match; i.e. the premise that 𝑇 can be simplified to have the specific syntactic form on the right-hand side of the equation. This is often necessary for the inferred type of a term, as e.g. in the elimination rule for ∏ (Figure 4.8). Consequently, we introduce the notation 𝑡 ≡⇒ 𝐸(⋅) as an abbreviation for the two consecutive premises 𝑡 ⇒ 𝑇; 𝑇 ≡ 𝐸(⋅).
4.3 Typing Rules and LF
LF [HHP93a] is a logical framework based on the dependently-typed lambda calculus 𝜆Π.
The grammar is given in Figure 4.6. The only deviation here is that we allow optional definitions in contexts, to make them syntactically equivalent to Mmt contexts (see Remark 4.1). LF specifically as implemented in Mmt is covered in more detail in [Rab17a]; the specific rules presented here are adapted from there.
The grammar in Figure 4.6 as well as the various grammars presented in Part II represent object-level languages independent of the Mmt language. If implemented within Mmt, such a grammar should not be seen as extending the Mmt grammar from Section 4.1.1, but rather as being embedded within it; for example, the well-formed expression 𝜆𝑥∶𝐴.𝑡 according to the LF grammar would be represented in OMDoc/Mmt as the well-formed expression 𝜆 [𝑥∶𝐴] (𝑡) according to the Mmt grammar.
Note that the grammar for contexts aligns perfectly with the grammar for Mmt contexts; hence we do not need to distinguish between the two apart from the well-formedness of the term components of the variables therein.
The universe rules of the framework are given in Figure 4.7 and are to be read in conjunction with the general rules in Figure 4.5.
Remark 4.5:
Usually, a new typing feature entails a type constructor 𝐶 (in this case ∏), one or several term constructors t (in this case 𝜆) and one or several elimination constructors e (in this case function application), and the rules governing these operators usually follow a specific pattern consisting of:
•
•
•
•
•
•
•
An additional rule – usually called the 𝜂-rule – that declares the introductory form to be surjective is often used in formal presentations. It could reasonably be called a representation rule, since it tells us that e.g. every function can be represented as a 𝜆-term. However, it is often a corollary of the equality and computation rules and hence does not need to be implemented as a separate rule.
In an implementation, it makes sense to add additional derivable inference, type checking, equality or computation rules, simply to speed up the type checking process or help with
solving implicit arguments. For the latter, the Mmt API offers an abstract class for SolutionRules, but we will largely ignore them in this thesis.
Unless otherwise specified, we assume the type constructor itself to be injective modulo 𝛼-renaming, i.e. two types 𝐶(𝑎), 𝐶(𝑏) are equal iff 𝑎 = 𝑏 and variables bound by 𝐶 in 𝐶(𝑎) and 𝐶(𝑏) can be suitably renamed.
The specific rules for the dependent function types in LF are given in Figure 4.8. Note that even though LF has precisely two universes type and kind, we still formulate the rules parametric in a generic universe 𝑈, to allow for flexibly extending the number of universes later (see Chapter 6, particularly Section 6.4).
Definition 4.1:
For convenience, we introduce a notation for the larger of two inhabitable terms, which we will extend as needed. We start with the cases relevant to LF:
•
For …: … if …; … if …; undefined otherwise.
•
For •
•
In all other cases, we consider it undefined.
We consider 𝑇 → 𝑇′ an abbreviation for ∏∶𝑇 𝑇′. We also write 𝑇[𝑥/𝑇′] for the usual capture-avoiding substitution of 𝑇′ for 𝑥 in 𝑇.
We can now show that the usual variance rule for function types and the 𝜂-rule are derivable. Even though plain LF has no notion of subtyping (in fact, every term has a unique type), extensions of LF may introduce subtyping principles, in which case this rule becomes important:
Proof.
1. Assume Γ ⊢ 𝐴 <∶ 𝐴′ and Γ ⊢ 𝐵′ <∶ 𝐵. It suffices to show Γ, 𝑓 ∶ ∏𝑥∶𝐴′ 𝐵′ ⊢ 𝑓 ⇐ ∏𝑥∶𝐴 𝐵. By Figure 4.8 we can conclude this from Γ, 𝑓 ∶ ∏𝑥∶𝐴′ 𝐵′, 𝑥 ∶ 𝐴 ⊢ 𝑓𝑥 ⇐ 𝐵. Since Γ ⊢ 𝐴 <∶ 𝐴′ we have Γ, 𝑥 ∶ 𝐴 ⊢ 𝑥 ⇐ 𝐴′ and consequently Γ, 𝑓 ∶ ∏𝑥∶𝐴′ 𝐵′, 𝑥 ∶ 𝐴 ⊢ 𝑓𝑥 ⇐ 𝐵′, so the claim follows by Γ ⊢ 𝐵′ <∶ 𝐵.
2. By the equality rule, we can show that given
Moreover, we can show that every well-typed term 𝑡 has a principal type 𝑇 in the sense that (i) Γ ⊢ 𝑡 ⇐ 𝑇 and (ii) whenever Γ ⊢ 𝑡 ⇐ 𝑇′, then also Γ ⊢ 𝑇 <∶ 𝑇′. The principal type is exactly the one inferred by our rules (see Theorem 8.1). This is easily proven by induction on the grammar; as a result, however, this property does not necessarily hold in the presence of other rules.
Remark 4.6: PLF
We can conveniently extend LF by shallow polymorphism; we call the resulting logical framework PLF (see [Rab17a]). It can be specified by a single additional rule:
This rule elegantly allows for declaring shallowly polymorphic functions (with type parameters only on the outside of the term), but since the polymorphic function types are merely inhabitable and not typed, they can only occur in well-typed expressions if the type arguments are instantiated via function application (the only ∏-rule that requires a function type to be typed by a universe is the formation rule, see Figure 4.8; hence type inference of polymorphic function types will fail, but inference of their applications will not).
For example, consider a type of groups over a fixed given type 𝐺. Using the above rule, we can specify this as a polymorphic operator group ∶ ∏𝐺∶type type. In this case, expressions such as 𝐺 ∶ group(𝑆) are perfectly valid, since group is well-typed. The type of group itself, however, is not well-typed; hence we cannot e.g. quantify over all operators of type ∏𝐺∶type type.
4.3.1 Judgments-as-Types
The lambda calculus described above makes LF a functioning logical framework, i.e. a language to specify the syntax and proof theory of various logics, type theories and related systems. This is achieved by using the
Judgments-as-Types methodology, in which (as the name suggests) we assign types to the judgments of a given logic. We can interpret these types as the types of proofs that the judgment holds.
Example 4.2:
Consider classical propositional logic. We can formalize the syntax by introducing a new type prop ∶ type in LF, and declaring the logical connectives as constants with function types; e.g. and ∶ prop → prop → prop. Given 𝐴, 𝐵 of type prop, and(𝐴, 𝐵) is now a syntactically well-formed expression of type prop.
The only judgment in propositional logic is that some proposition is true, hence we introduce an operator DED ∶ prop → type assigning a type to each proposition. As mentioned, given some formula 𝜑 we can think of the type DED 𝜑 as the type of proofs of 𝜑. By declaring a new constant of type DED 𝜑 we can axiomatically declare 𝜑 to be true, whereas to prove 𝜑 we need to construct a term of that type.
We can allow for the latter by formalizing the rules of any proof calculus as functions on DED-types. Consider as an example the conjunction introduction rule:
Using DED, we can now declare this rule by introducing a function:
andI ∶ ∏𝐴∶prop ∏𝐵∶prop DED 𝐴 → DED 𝐵 → DED and(𝐴, 𝐵)
It takes two propositions 𝐴, 𝐵 as arguments, as well as (what we can think of as) proofs for both, and returns (what we can think of as) a proof for 𝐴 ∧ 𝐵.
Remark 4.7:
The andI-rule in the last example suggests reading dependent function types in a different way: while we can think of the type as a function type, we can also read the expression as “For all 𝐴 ∶ prop and for all 𝐵 ∶ prop ...”. This reading is formally justified by the Curry-Howard correspondence between a type ∏𝑥∶𝐴 𝐵 and the universal quantifier ∀𝑥∶𝐴. 𝐵. Analogously, a simple function type 𝐴 → 𝐵 corresponds to the implication 𝐴 ⇒ 𝐵.
Under the Curry-Howard correspondence, the introduction and elimination rules of a typing feature often mirror the respective inference rules in a natural deduction calculus.
4.3.2 Higher-Order Abstract Syntax
In this thesis we use the same notation for the application of LF functions 𝑓 to arguments 𝑎 as for the Mmt primitive application of a term 𝑓 to a list of terms 𝑎. This is to avoid cumbersome notations for LF-applications like 𝑓@𝑎. However, it should be noted that technically, an application 𝑓(𝑎) of a function 𝑓 on the level of LF-expressions is internally represented as apply(𝑓, 𝑎) – the symbol apply provided by the LF-theory is applied (in the sense of an OMDoc/Mmt-application) to the arguments 𝑓 and 𝑎. In a naive implementation, this means that the head of the term 𝑓(𝑎) is not in fact the symbol 𝑓, but instead the symbol apply.
Furthermore, if we use LF to formalize another 𝜆-calculus 𝒞, we will need to provide constants for the 𝜆- and apply-operators of 𝒞. This naively means that we need to reimplement many mechanisms which are already present in LF. Fortunately, we can use higher-order abstract syntax to have our new symbols inherit their core behaviors from the corresponding symbols in LF.
Example 4.3: 𝜆-Calculus
We can formalize 𝒞 by introducing constants
𝒞type ∶ type
𝒞expr ∶ 𝒞type → type
𝒞funtype ∶ 𝒞type → 𝒞type → 𝒞type
representing the types of 𝒞, the expressions of type 𝐴 in 𝒞 and (simple) function types.
We can then formalize a 𝜆-operator as an LF-function that “abuses” the LF-lambda as a binder for the variable of a 𝜆-term; hence taking a function as argument. The 𝒞-expression 𝜆𝑥∶𝐴.𝑡(𝑥) can then be represented as 𝒞lambda(𝜆𝑥∶𝒞expr(𝐴).𝑡(𝑥)), and the application 𝑓(𝑎) as 𝒞apply(𝑓, 𝑎), which internally acquires a second layer, i.e. apply(apply(𝒞apply, 𝑓), 𝑎).
All of this makes it quite convenient to formalize type systems in LF. Internally, however, it is rather inconvenient to handle even simple function applications in 𝒞, because of the nesting of apply-symbols on the various meta-levels involved.
For that reason, Mmt offers functionality that allows to signify symbols as belonging to a higher-order abstract syntax, and to abstract them away in settings where we would rather have the term look like 𝑓(𝑎) than apply(apply(𝒞apply, 𝑓), 𝑎) – for example when implementing rules.
Example 4.4:
Let us focus on the higher-order abstract syntax rules for LF. The relevant symbols that the system needs to know about are LF’s lambda and apply operations. Applications of apply to terms 𝑓 and 𝑎 are then internally simplified to applications of 𝑓 to 𝑎 directly, and terms of the form 𝑓(𝜆𝑥∶𝐴.𝑡) are simplified to binding applications 𝑓 [𝑥∶𝐴](𝑡). The symbols are bundled in a simple case class HOAS, which is passed on to a class extending HOASNotation that takes care of the computational aspects. The case for LF hence looks like this:
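As a rough illustration of what such a bundling accomplishes, the following self-contained sketch shows the flattening behavior over a miniature term language. All names here (Term, OMS, OMA, HOAS, flatten) are simplified stand-ins chosen for the example, not the actual Mmt API classes:

```scala
// Miniature term language; OMS/OMA are simplified stand-ins for the Mmt classes.
sealed trait Term
case class OMS(name: String) extends Term                 // reference to a symbol
case class OMA(fun: Term, args: List[Term]) extends Term  // primitive application

// Bundles the symbols making up a higher-order abstract syntax.
case class HOAS(applySym: String, lambdaSym: String) {
  // Collapse nested applications of `applySym`, so that
  // apply(apply(Capply, f), a) is presented as Capply(f, a).
  def flatten(t: Term): Term = t match {
    case OMA(OMS(`applySym`), fun :: args) =>
      flatten(fun) match {
        case OMA(f, as) => OMA(f, as ::: args.map(flatten)) // merge argument lists
        case f          => OMA(f, args.map(flatten))
      }
    case OMA(f, args) => OMA(flatten(f), args.map(flatten))
    case other        => other
  }
}

object HOASDemo {
  val lfHoas = HOAS("apply", "lambda")
  // the internal form apply(apply(Capply, f), a) from the example above
  val nested = OMA(OMS("apply"),
    List(OMA(OMS("apply"), List(OMS("Capply"), OMS("f"))), OMS("a")))
  val flat = lfHoas.flatten(nested) // Capply applied directly to f and a
}
```

Running flatten on the nested term yields the direct application of Capply to both arguments, which is the presentation Mmt prefers when implementing rules.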
4.4 Mmt/LF Surface Syntax
By and large, I will show relevant examples in Mmt surface syntax directly whenever it is informative to do so. In this section I will consequently briefly introduce the structural syntax for Mmt and the relevant notations for symbols provided by LF. An in-depth tutorial on using the Mmt surface language to formalize mathematical content is available online.
Especially noteworthy is Mmt’s usage of highly non-standard delimiters, namely higher unicode symbols. This allows all standard delimiters used by other systems to be used in notations when implementing the foundations of these systems. A structural environment is delimited according to the level on which it is implemented (see Figure 4.2): modules are delimited with the symbol ❚, declarations with ❙, and objects with ❘. The colors correspond to the ones used in the available Mmt IDEs (jEdit and IntelliJ IDEA).
A theory named T with meta-theory M is given with the syntax theory T : M = <𝑐𝑜𝑛𝑡𝑒𝑛𝑡> ❚. If no namespace is provided, Mmt will either take one provided by the meta information of the containing archive, or use a default namespace. A local namespace can be provided beforehand using the namespace command. Using the latter, the following will establish an empty theory with full URI http://mathhub.info/example?T :
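For illustration, such a namespace/theory pair might be written as follows (a sketch, not a verbatim listing; ❚ denotes the module delimiter):

```
namespace http://mathhub.info/example ❚

theory T = ❚
```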
The theory providing LF has the URI http://cds.omdoc.org/urtheories?LF; however, Mmt provides the default CURIE abbreviation ur for its namespace – hence to create a new theory with LF as meta-theory, we can write theory T : ur:?LF = ❚.
A constant is declared by giving its name, followed by its components separated by the object delimiter ❘ in arbitrary order. A component is either a type (starting with ∶), a definiens (starting with =) or a notation (starting with #), and the declaration as a whole is delimited with the declaration delimiter ❙ – e.g. a constant c with type T and definiens t can be given either as c : T ❘ = t ❙ or c = t ❘ : T ❙. Includes are given with include ?T ❙.
A notation is provided as a sequence of tokens and argument markers followed by a precedence. Argument markers are either a number literal 𝑛 (denoting the 𝑛th argument of the constant), or V1 denoting a variable to be bound by the constant. If V1 is followed by a T, the system expects the variables provided in an application to be typed, inferring their types if none are provided by the user. If an argument marker is followed by a token 𝑠 followed by ellipses ..., the system recognizes that a sequence of arguments is expected, the elements of which are delimited by 𝑠. Precedences are given as prec n for some (positive or negative) integer n – the higher the precedence, the stronger the notation binds. We will examine the notations provided by LF as an example:
Dependent function types ∏𝑥∶𝐴,𝑦∶𝐵 𝐶 are written as {x:A,y:B}C – a notational practice inspired by the Twelf system. Consequently, the notation for the symbol Pi representing dependent function types has two argument markers: the first one being a comma-separated list of typed variables (represented as V1T,...) and the second one being a regular argument. Hence, the theory ?LF declares Pi as Pi # { V1T,...} 2 prec −10000 ❙. It is given a deliberately extremely low precedence, since dependent function types most commonly occur on the outside of a term. Analogously, lambda is given the notation [ V1T,...] 2 prec −10000.
Simple function types have the notation # 1 → ... prec −9990. Finally, as usual in lambda calculi, function application has the notation # 1 %w... prec −10, where %w represents arbitrary whitespace.
The syntax for views is rather analogous to that of theories. As a module, a view is delimited with the module delimiter ❚. A view named v with domain T1 and codomain T2 is given via view v : ?T1 −> ?T2 = <𝑐𝑜𝑛𝑡𝑒𝑛𝑡> ❚. Assignments are considered declarations and use = as their symbol; i.e. an assignment 𝑐 ↦ 𝑡 is given as c=t.
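Putting this together, a small view might look as follows (a sketch with hypothetical names T1, T2, c and t; ❙ and ❚ denote the declaration and module delimiters):

```
view v : ?T1 -> ?T2 =
  c = t ❙
❚
```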
For any 𝑈 with Γ ⊢ 𝑈 univ: Formation, Introduction, Elimination, Type Checking, Equality and Computation rules.
Figure 4.8: Rules for Dependent Function Types
Part II
Modular Design of Logical Frameworks
One crucial aspect of connecting libraries of mathematical knowledge to a universal framework – whether they come from theorem provers, computer algebra systems, databases or other systems – is to specify the ontological primitives of the system within the framework. Correspondingly, sufficiently expressive logical frameworks are needed to specify all the features such a system might offer.
A logical framework like LF [HHP93b], Dedukti [BCH12], or
𝜆
-Prolog [MN86] tends to admit very elegant definitions for a certain class of logics (such as first-order or higher-order logic, modal logics or classical type theories), but definitions can get awkward quickly if logics fall outside that fragment.
This often boils down to the question of shallow vs. deep encodings. The former represents a logic feature (e.g., subtyping) in terms of a corresponding framework feature, whereas the latter applies a logic encoding to remove the feature (e.g., encode subtyping in a logical framework without subtyping by using a subtyping predicate and coercion functions). Deep encodings have two disadvantages:
1.
2.
Even if we ignore the proof theory (and thus the use of decision procedures) entirely, PVS (which we discuss more in-depth and integrate in Chapter 10) is particularly challenging in this regard, and hence serves as a good example of the challenges involved:
•
•
•
•
•
Florian Rabe has extensively investigated definitions of PVS in logical frameworks, going back more than ten years to a first (unpublished) attempt to define PVS in LF made by Schürmann as part of the Logosphere project [Pfe+03]. In the end, all of the above-mentioned difficulties pointed in the same direction: the logical framework must adapt to the complexity of PVS – any attempt to adapt PVS to an existing logical framework (by designing an appropriate deep encoding) is likely to be doomed. This negative result is in itself a notable contribution of [Koh+17c], where the PVS import described in Chapter 10 was first published. Since publication, we have found this to also hold for similarly complex object logics such as Coq.
Hence we need logical frameworks with advanced features. However, not all additional features should be active and available simultaneously all the time – the additional inference rules, even if logically sound in conjunction, can cause problems such as slowing down type checking, reserving notations, making the type system unnecessarily complex and potentially destroying the adequacy of a library translation. Especially the first can happen easily in the presence of a critical pair of typing rules – two rules concluding the same judgment, but from different (and potentially expensive to check) premises. An example of such a critical pair on the meta-level is the two fundamental strategies in a bidirectional type checking algorithm (see Section 4.2.1).
What we actually need in practice is therefore a modular meta logical framework with a library of optional features that can be included on demand.
This library can be seen as an analogue of LATIN (see Section 3.1) for logical frameworks. Where LATIN resulted in a modular library of logics using a fixed logical framework (namely LF), what we need to go beyond the “simple” logics in LATIN is a similarly modular library of logical frameworks, written in an appropriate meta-framework.
This is more difficult than it seems in that the rules for each feature must be implemented in such a way that they are orthogonal – they should be compatible (if possible) with each other and be minimally dependent on other features. A natural starting point of such a framework is the set of orthogonal features of Martin-Löf type theory [ML94], which are covered in Chapters 5 and 6. One of these – dependent function types – is already implemented by LF and described in Section 4.3.
Various extensions of LF have been proposed and designed before; [Pfe93] presents an extension of LF with refinement types (and intersection types, which we will cover in Section 7.3), [Sto+17] adds additional union types (essentially equivalent to coproducts, see Section 6.1). The grammatical framework GF [Ran04] is based on an extension of LF by record types.
Apart from the latter, all the mentioned extensions lack an implementation, and all of them have been designed with the additional feature as a primitive – in other words, the rules are intrinsically non-compositional and consequently not suited for a modular approach. The advantage of the primitive approach is that it allows for tightly controlling the interactions of different primitive typing features. This makes it relatively easy to ensure some desirable aspects of the typing system in the presence of several features at once, such as good performance or decidability. In fact, [Pfe93] makes decidability one of its central results. Our modular approach on the other hand makes it considerably more difficult to verify these kinds of properties. However, adding e.g. predicate subtyping to a logical framework – as is necessary for formalizing PVS for example (see Chapter 10) – renders the type system immediately undecidable anyway. This also holds for many other features that increase the expressivity of a logical framework, such as the model types presented in Chapter 8. Since expressivity is our primary concern, we consider losing decidability less of a problem in the context of this thesis. Similarly, in a tradeoff between modularity/expressivity and performance, we prioritize the former.
Conveniently, the modular implementation of the type checking component of Mmt, and the abstractions provided in the Mmt API, result in a very small difference between the theoretical design of a logical framework and its implementation in Mmt. In fact, the set of formal rules as developed on paper corresponds closely to the set of implemented classes in Scala.
In this part, I will present various extensions of LF. In doing so, I will explain in detail how to implement new language features in Mmt. For that reason, we will start with a relatively simple feature – namely ∑-types in Chapter 5 – as an introduction to the relevant classes of the Mmt API. Chapter 6 covers additional simple typing features (primarily those of Martin-Löf type theory), culminating in a logical framework based on homotopy type theory [Uni13] as a case study in Section 6.5. Our approach to implementing logical frameworks in Mmt is briefly described in [MR19], but is essentially previously undocumented and hence covered here in greater detail.
Continuing with more complicated features, Chapter 7 covers subtyping mechanisms in Mmt, and Chapter 8 covers our implementation of record and model types in detail.
All the rule systems presented in this part follow a general pattern for typing rules presented in Remark 4.5. Also, note Remark 4.4 on how to read our rules algorithmically. The Mmt API provides a dedicated class interface for each kind of rule – Figure 4.9 gives an overview of those covered in this part.
• General Aspects: Mmt Judgments, Symbol References, Solver Judgments and Continuations
• Typing Rules: Type Inference Rules, Type Checking Rules, Subtyping Rules, Universe Rules
• Equality Rules: Type Based Equality Rules, Term Head Based Equality Rules, Irrelevance Rules, Computation Rules
• Other Features: Generic Literals, Structural Features, Rule Generators / Change Listeners
Figure 4.9: Table of Mmt API Classes for Solver Rules
Most inference rules in Chapters 5 and 6 are taken from [Uni13] with minor adaptations to minimize dependencies on other typing features and be consistent in their presentation with the rest of this thesis. Where multiple design options exist, these are discussed explicitly – notably, in that case we do not decide on a specific rule system, but rather implement all of them as separate theories extending the same core rules, allowing to pick the most adequate implementation for a given logic formalization while reusing the same symbols across logics for the required features.
Chapter 5
Implementing Rules in Mmt (e.g. ∑-Types)
In this chapter we will go over the implementation of ∑-types in Mmt in detail, as a relatively simple example of how to extend the Mmt solver with rules in a modular fashion. The main contribution of this chapter is the adaptation of the typing rules governing ∑-types into the Mmt framework and their implementation, as well as a detailed explanation of our methodology for constructing a modular meta logical framework.
5.1 The Rules for ∑-Types
Since ∑-types arise naturally as the dependent variant of Cartesian products, their history is difficult to reconstruct. To the best of my knowledge, they, along with most of the features in the subsequent chapter, were originally introduced as a consequence of the Curry-Howard correspondence (as e.g. in [ML94]), where the simple products 𝐴 × 𝐵 correspond to conjunctions 𝐴 ∧ 𝐵 and dependent ∑-types ∑𝑥∶𝐴 𝐵(𝑥) correspond to the existential quantifier ∃𝑥∶𝐴. 𝐵(𝑥).
The grammar for
∑
-types is given in Figure 5.1.
Remark 5.1:
The rules are correspondingly intuitive and given in Figure 5.2 (adapted from [Uni13]).
Theorem 5.1:
The following subtyping rule is derivable:
Furthermore, the 𝜂-rule (i.e. for any 𝑝 ∶ ∑𝑥∶𝐴 𝐵 we have ⟨𝜋ℓ(𝑝), 𝜋𝑟(𝑝)⟩ = 𝑝) holds.
Proof. Analogous to Theorem 4.1.
Remark 5.2:
Alternatively, they can almost be defined using ∏-types in PLF (see Remark 4.6) given a corresponding polymorphic equality ≐; however, the computation rule for 𝜋𝑟(⋅) causes problems. Its formalization would be
elimr ∶ ∏𝐴∶type ∏𝐵∶𝐴→type ∏𝑎∶𝐴 ∏𝑏∶𝐵(𝑎) 𝜋𝑟(⟨𝑎, 𝑏⟩) ≐ 𝑏
If ≐ is a polymorphic equality with implicit type argument, this is only well-typed if both sides of the equation (i.e. 𝜋𝑟(⟨𝑎, 𝑏⟩) and 𝑏) have the same type; however, 𝑏 has type 𝐵(𝑎) whereas 𝜋𝑟(⟨𝑎, 𝑏⟩) has, by the elimination rule, type 𝐵(𝜋ℓ(⟨𝑎, 𝑏⟩)). To equate these two types, the system needs to be able to exploit the elimination rule for 𝜋ℓ(⋅) to rewrite 𝜋ℓ(⟨𝑎, 𝑏⟩) as 𝑎, which requires a mechanism for generating computation rules from object-level equations. Such a mechanism has been implemented in Mmt, but consequently already goes beyond LF as a logical framework.
Before we look at the implementation, a couple of things should be noted:
1.
2.
3.
Remark 5.3: Currying
In the presence of both ∑-types and function types, it makes sense to consider currying, i.e. the fact that we can think of a function 𝐴 × 𝐵 → 𝐶 as a function 𝐴 → 𝐵 → 𝐶 – or, in the dependent case, a function ∏𝑝∶∑𝑥∶𝐴 𝐵(𝑥) 𝐶(𝑝) as a function ∏𝑥∶𝐴 ∏𝑦∶𝐵(𝑥) 𝐶(𝑥, 𝑦) (and of course analogously 𝐶 ∶ (∑𝑥∶𝐴 𝐵(𝑥)) → 𝑈 as 𝐶 ∶ ∏𝑥∶𝐴 𝐵(𝑥) → 𝑈).
It is debatable whether currying should be considered a judgment-level equality. If we want this to be the case, we can add two additional ComputationRules (see Section 5.2.6) that take care of ∏-types and 𝜆-expressions with ∑-type arguments.
As mentioned in the introduction of this part, for the sake of modularity we do not fix a specific rule set, but rather implement such a rule in a separate theory to be included or excluded as appropriate for any specific use case. It should be noted, however, that a currying rule has no notable disadvantages.
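To make the underlying rewrite concrete, here is a self-contained sketch of what such a currying rule computes, over a toy syntax of simple types (this is not the Mmt ComputationRule class; the names Tp, Base, Arrow, Prod and curry are invented for the example):

```scala
// Toy syntax of simple types, for illustration only.
sealed trait Tp
case class Base(name: String) extends Tp
case class Arrow(from: Tp, to: Tp) extends Tp   // function type A → B
case class Prod(left: Tp, right: Tp) extends Tp // product type A × B

object Currying {
  // The currying rewrite: (A × B) → C becomes A → B → C, applied recursively.
  def curry(t: Tp): Tp = t match {
    case Arrow(Prod(a, b), c) => curry(Arrow(a, Arrow(b, c)))
    case Arrow(a, b)          => Arrow(curry(a), curry(b))
    case Prod(a, b)           => Prod(curry(a), curry(b))
    case b: Base              => b
  }
}
```

In the dependent case the same idea additionally has to substitute the pair variable by a pair of fresh variables, which is why the actual rule is more involved.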
5.2 Implementing Rule Systems in Mmt
5.2.1 Mmt Symbols in Scala
Before implementing the rules themselves, we create an Mmt theory that contains the symbols we need, so that we can assign them proper notations determining their syntactic usage and refer to them in the implementations of the rules. In the case of ∑-types, this theory contains a symbol for the type constructor ∑, the introduction form ⟨⋅, ⋅⟩ and the two elimination forms 𝜋ℓ(⋅), 𝜋𝑟(⋅). For convenience we add an additional symbol for simple products 𝐴 × 𝐵. The resulting theory is given in Listing 5.1.
Listing 5.1: The Symbols Theory
Note that we specified the notations of Sigma, Product and Tuple all to be flexary – hence our rule will have to deconstruct a type ∑𝑥∶𝐴,𝑦∶𝐵 𝐶 into the actual type ∑𝑥∶𝐴 ∑𝑦∶𝐵 𝐶, and analogously a tuple ⟨𝑎, 𝑏, 𝑐⟩ into the actual term ⟨𝑎, ⟨𝑏, 𝑐⟩⟩.
For convenience, we can implement helper objects in Scala that provide apply and unapply methods to easily construct – and pattern match against – applications of our symbols. These can not only take care of the flexary notations, but also of Product being an abbreviation, which makes implementing our rules easier and more uniform.
Remark 5.4:
In Scala, if an object Foo has an apply-method, it can be called by the name of the object directly; i.e. Foo(args) is an abbreviation for Foo.apply(args). An unapply-method on the other hand can be used to pattern match; i.e. if we write
then upon encountering the case Foo(args) during runtime, the method Foo.unapply will be called on t and returns an Option[A] (for an arbitrary type A). If the result is an instance of Some(ret), then in the pattern match above the case Foo(args) applies and args will be instantiated with ret. This makes Scala an extremely convenient language to use when wanting to handle complex terms of specific syntactic forms, as here.
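A minimal, self-contained illustration of this apply/unapply convention (the types Wrapped and Foo and the doubling logic are invented for the example):

```scala
// A toy data type and an extractor object with apply/unapply.
case class Wrapped(value: Int)

object Foo {
  // Foo(n) is shorthand for Foo.apply(n)
  def apply(n: Int): Wrapped = Wrapped(n * 2)
  // enables `case Foo(n) => ...` pattern matches on Wrapped values
  def unapply(w: Wrapped): Option[Int] = Some(w.value / 2)
}

object FooDemo {
  def describe(t: Wrapped): String = t match {
    case Foo(n) => s"built from $n" // calls Foo.unapply(t) behind the scenes
  }
}
```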
Example 5.1:
The helper objects for Product and Sigma are given in Listing 5.2.
Listing 5.2: Helper Objects
The class LFSigmaSymbol merely provides the full path of the symbol under consideration (so we only need to provide the name) and the Term-object used to refer to the symbol in an expression. Note that the unapply-method of Sigma has a case for Product.term, making sure that any application of the Product-symbol pattern matches against Sigma as well. This means that when implementing our rules, we only ever need to check whether a term is a ∑-type, knowing that in doing so we cover the simple products as well. The analogous objects for pairs and projections are called Tuple, Proj1 and Proj2, respectively.
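Along these lines, the helper objects can be sketched over a toy term type. OMS, OMA and OMBIND below are illustrative stand-ins for the corresponding Mmt classes, and the string paths replace the actual Mmt URIs:

```scala
// Toy term language standing in for the Mmt Term classes.
sealed trait Term
case class OMS(path: String) extends Term
case class OMA(head: Term, args: List[Term]) extends Term
case class OMBIND(binder: Term, varName: String, varType: Term, body: Term) extends Term

object Product {
  val term: Term = OMS("Product") // stand-in for the symbol's full Mmt URI
  def apply(a: Term, b: Term): Term = OMA(term, List(a, b))
}

object Sigma {
  val term: Term = OMS("Sigma")
  def apply(x: String, tp: Term, body: Term): Term = OMBIND(term, x, tp, body)
  // unapply also matches simple products, reading A × B as a Sigma with unused variable
  def unapply(t: Term): Option[(String, Term, Term)] = t match {
    case OMBIND(Sigma.term, x, tp, body) => Some((x, tp, body))
    case OMA(Product.term, List(a, b))   => Some(("_", a, b))
    case _                               => None
  }
}
```

With this, a rule can match `case Sigma(x, tpA, tpB)` and transparently cover both genuine ∑-types and simple products.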
5.2.2 Solver Judgments
The solver checks a judgment Γ ⊢ 𝐽 by collecting all rules included in the context Γ that correspond to the type of the judgment (an extension of the class Judgment), checking which of them are applicable and trying them in order of priority. These Judgments are:
•
The judgment is checked by iteratively simplifying both terms using ComputationRules (see Section 5.2.6) until they are syntactically equal or an EqualityRule (see Section 5.2.5) is applicable.
•
•
•
To implement our rules, the Mmt API provides the mentioned abstract classes for the different rule types. All of their extensions need to implement an apply-method with a particular signature. A Solver is passed on to the rule to allow for recursive checking and inference calls.
The most important methods a Solver provides for a rule are:
•
•
•
•
•
5.2.3 Inference Rules
Formation, introduction and elimination rules are all trivial extensions of the class InferenceRule. They take as class arguments the head symbol of the terms to which they apply (and the typing symbol used for the typing judgments), and their apply-method has the following signature:
where tm is the term whose type is to be inferred, and covered is a boolean flag that tells the rule whether the term tm has already been checked to be well-typed, and hence that the preconditions (see Section 4.2.1) are satisfied.
The implicit argument stack contains the context in which the type of tm is to be inferred. history is an object that holds textual information on the current state of the solver – if a check fails, the history is presented to the user as feedback.
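Based on this description, the apply-method plausibly has the following shape. The surrounding types are stubs here, and details of the actual Mmt API may differ:

```scala
// Stub types standing in for the actual Mmt API classes.
class Solver
class Stack
class History
sealed trait Term
case class Const(name: String) extends Term

abstract class InferenceRule(val head: String, val typingSymbol: String) {
  // tm:      the term whose type is to be inferred
  // covered: whether tm is already known to be well-typed
  // returns  Some(inferred type), or None if the rule fails
  def apply(solver: Solver)(tm: Term, covered: Boolean)
           (implicit stack: Stack, history: History): Option[Term]
}

object InferenceRuleDemo {
  // A trivial rule that "infers" a fixed type, only to exercise the signature.
  val dummy = new InferenceRule("pair", "oftype") {
    def apply(solver: Solver)(tm: Term, covered: Boolean)
             (implicit stack: Stack, history: History): Option[Term] =
      Some(Const("SomeType"))
  }
}
```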
An inference rule with head ℎ is considered applicable to a term if the head of the term is ℎ. Additionally, a rule can override a method alternativeHeads:List[GlobalName] if it should be applicable to other heads as well, as in our case with Sigma and Product (see Example 5.3).
Example 5.2:
Listing 5.3 shows the Scala code for the introduction rule:
The rule itself has as parameters the Mmt URI for the
Tuple
-symbol and the symbol for the typing operator (ultimately provided by LF). The apply-method
|
|
|
_
|
• Line 7 deconstructs the term to be a Tuple(t1,t2).
• Line 8 infers the type of t1 to be tpA (or returns None if this fails).
• Line 11 infers the type of t2 to be tpB (or returns None if this fails).
• Line 14 picks a fresh variable name xn not occurring in the current context.
• Line 15 returns the type of the tuple term as Some(Sigma(xn,tpA,tpB)).
Listing 5.3: Introduction Rule
5.2.4 Checking Rules
Similarly to inference rules, TypingRules (which trivially extend CheckingRule) take the head symbol of the type to which they apply as argument. They too have an overridable method alternativeHeads:List[GlobalName] if multiple head symbols are covered. Additionally, checking rules can shadow other rules (by overriding the method shadowedRules:List[Rule]), in which case the shadowed ones are deactivated in the presence of the new rule. Furthermore, checking rules can be assigned a priority : Int with default value 0 to control in which order the solver attempts multiple applicable rules. The signature of their apply-method that needs to be implemented is
Alternatively to returning a simple Some(Boolean) value, checking rules can also:
1.
2.
3.
Example 5.3: Listing 5.4 shows the implementation of the type checking rule for ∑-types:
• Line 4 adds the …
• Line 8 deconstructs the type to be a Sigma(x,tpA,tpB).
• Line 9 instructs the solver to verify the judgment that Proj1(tm) has type tpA.
• Line 10 instructs the solver to verify the judgment that Proj2(tm) has the corresponding (substituted) type tpB.
Listing 5.4: Type Checking Rule
5.2.5 Equality Rules
The Mmt API provides three classes for equality rules: TypeBasedEqualityRule, TermBasedEqualityRule and TermHeadBasedEqualityRule.
•
They take as class parameters the head symbol of the governing type, as well as a (possibly empty) list of application-symbols, in case the type constructor is applied using higher-order abstract syntax. This can be used to implement equality rules for e.g. symbols from a logic or type theory which is itself implemented in a logical framework. Additionally, an instance of this class has to implement an applicableToTerm-method. This is convenient in cases where the rule requires the two terms to have a specific form.
The signatures of the methods to implement are:
The applicableToTerm-method will be called on both terms, and the rule is considered applicable if either call returns true. The apply-method should return Some(true) or Some(false) if the equality of the two terms tm1 and tm2 is provable or disprovable, respectively, and None if the solver should proceed trying other equality rules.
•
The applicable-method should fail quickly; preferentially it should only check whether the terms tm1 and tm2 have a specific syntactic form.
The apply-method only takes a CheckingCallback, which is a superclass of Solver with slightly limited functionality.
•
Example 5.4: In the case of ∑-types, we have a typed equality rule (for pairs at their ∑-types). Hence we choose a TypeBasedEqualityRule, which is given in Listing 5.5.
• Line 4 adds the …
• Line 5 determines that the rule is applicable to any pair of terms.
• Line 10 deconstructs the type to be a Sigma(x,tpA,tpB).
• Line 11 checks that the projections Proj1(tm1) and Proj1(tm2) are equal at type tpA.
• Line 12 checks that the two terms Proj2(tm1) and Proj2(tm2) are equal under the type tpB.
Listing 5.5: Equality Rule
5.2.6 Computation Rules
Computation rules implement the class ComputationRule, which takes as class parameter the head symbol of the term to decide applicability. This is only a default implementation, though – the governing method def applicable(tm : Term): Boolean can be overridden. The apply-method has the following signature:
It returns an object of type Simplifiability that instructs the solver (or other CheckingCallback instance) on how to recurse into a term. The latter has by and large three possible instances:
1.
2.
3.
The solver can use the information returned by a ComputationRule to strategically recurse only into those subterms that might cause a rule to become applicable, without unnecessarily simplifying subexpressions or expanding definitions.
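The idea behind Simplifiability can be pictured with a self-contained sketch. The names Simplify, RecurseOnly, and NoRecurse below are stand-ins modeled on the description above, not the actual Mmt classes, and the toy solver simplifies all arguments instead of only the indicated positions:

```scala
sealed trait Term
case class Sym(name: String) extends Term
case class App(head: String, args: List[Term]) extends Term

// Sketch of the three kinds of results a computation rule can return
sealed trait Simplifiability
case class Simplify(result: Term) extends Simplifiability             // rule fired
case class RecurseOnly(positions: List[Int]) extends Simplifiability  // these args might help
case object NoRecurse extends Simplifiability                         // rule can never apply

// Toy projection rule: Proj1(Tuple(a, b)) ~> a
object Proj1Rule {
  def apply(tm: Term): Simplifiability = tm match {
    case App("Proj1", List(App("Tuple", List(a, _)))) => Simplify(a)
    case App("Proj1", List(_)) => RecurseOnly(List(1)) // argument might become a Tuple
    case _                     => NoRecurse
  }
}

// A solver uses the returned value to decide whether recursing could help
// (here simplified: we recurse into all arguments, not only the positions)
def simplify(tm: Term): Term = Proj1Rule(tm) match {
  case Simplify(res) => simplify(res)
  case RecurseOnly(_) => tm match {
    case App(h, args) =>
      val tm2 = App(h, args.map(simplify))
      if (tm2 == tm) tm2 else simplify(tm2)
    case _ => tm
  }
  case NoRecurse => tm
}

val t = App("Proj1", List(App("Proj1", List(App("Tuple",
  List(App("Tuple", List(Sym("a"), Sym("b"))), Sym("c")))))))
```

Here the outer projection is not yet reducible, so the rule requests recursion into its argument; after the inner projection fires, the outer one becomes reducible as well.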
Example 5.5: In the case of ∑-types, we have the following computation rules. The implementation of the first one is presented in Listing 5.6; the second one is completely analogous.
• Line 5 is the intended case for the rule – a projection applied to a tuple.
• Line 6 To satisfy the precondition, we call inferType(b,false) if tm has not been checked to be well-typed (i.e. covered is false).
• Line 8 is the case where we have a projection applied to some other term.
• Line 9 is the default case, where tm is not a projection. By virtue of the class parameter Proj1.path this case should never happen, but the general information to return (Simplifiability.NoRecurse) would be that this rule is never applicable to the given term unless its head changes (to an actual projection).
Listing 5.6: Computation Rule
5.2.7 Subtyping Rules
Finally, to implement subtyping rules Mmt offers a generic class SubtypingRule, and conveniently a more specific class VarianceRule for the case where we want to declare an operator to be covariant or contravariant in its arguments. The former needs to implement an applicable(tp1 : Term, tp2 : Term)-method, the latter takes a head symbol as class argument that determines the applicability of the rule. In both cases, the apply-method has the signature
Example 5.6: In the case of ∑-types, we have the derivable subtyping rule given in Theorem 5.1. The implementation of this rule is presented in Listing 5.7.
• Line 7 deconstructs both types into their components.
• Line 8 instructs the solver to check that a1 is a subtype of a2.
• Line 9 picks a new variable name xn not occurring in the current context.
• Lines 10–13 instruct the solver to check that, in the context extended by xn:a1 (stack ++ xn%a1), b1 ^? (x1/OMV(xn)) is a subtype of b2 ^? (x2/OMV(xn)) (i.e. the original variable names are replaced by the one introduced in Line 9).
Listing 5.7: Subtyping Rule
After implementing the rules in Scala, we can import them into an Mmt theory. Including the latter into any theory will then activate those rules. It makes sense to declare the symbols and rules in separate theories, so we can reuse the same symbols with different sets of typing rules, as in Listing 5.8. Ultimately, we implement four theories in total: one for the symbols (Symbols), one for the rules (Rules), a theory that combines symbols and rules in the context of the basic typing rules (see Figure 4.7) from LF (TypedSigma), and finally one that adds symbols and rules on top of the full LF theory with dependent function types (LFSigma). We will roughly follow this naming convention throughout this part; the various theories for e.g. the symbols are disambiguated by their namespaces (in this case http://gl.mathhub.info/MMT/LFX/Sigma).
Listing 5.8: The Rules Theory
The import statement in the first line of Listing 5.8 introduces an abbreviation for the namespace scala://Sigma.LFX.mmt.kwarc.info, which by virtue of the scala-scheme determines the fully qualified class path of the Scala objects referenced by the subsequently declared rule constants.
5.3 Using ∑-Types
The theory in Listing 5.9 from the Math-in-the-Middle library serves as an example; it uses ∑-types to implement the product of two vector spaces by defining the corresponding operations on the product space. Since the latter is a natural occurrence of Cartesian products in mathematics, ∑-types are naturally ideal for formalizing products of spaces (analogously e.g. product topologies, categories etc.). In fact, thanks to ∑-types, the definitions of the universes, operations, units and inverses here conform quite naturally to the usual informal presentation of product spaces.
Listing 5.9: Product Spaces using Sigma Types
Figure 5.2: Rules for ∑-Types (formation, introduction, elimination, type checking, equality, and computation rules, each for any 𝑈 with Γ ⊢ 𝑈 univ)
Chapter 6
LFX
Many of the features in this chapter are primitives in homotopy type theory, and are hence treated in some detail in [Uni13]. The contribution of this chapter, like that of the previous chapter, consists in adapting the rules for the typing features covered here to the Mmt framework, and in their implementation. Implemented together and correspondingly modularly, these features yield a flexible, modular meta-logical framework following the LATIN approach.
6.1 Coproducts
Coproducts 𝐴 ⊕ 𝐵 can be seen as disjoint union types – their introduction form is a simple embedding, the elimination form a case distinction on the constituent types 𝐴 and 𝐵. In particular, coproducts give us a relatively simply implementable mechanism for pattern matching. Under propositions-as-types, coproducts are the type-level counterparts to logical disjunction. Additionally, they arise rather naturally as the (category-theoretical) dual notion to simple Cartesian products.
To be more precise:
•
•
•
The grammar for coproducts is given in Figure 6.1.
Remark 6.1: Function Addition
The elimination form basically gives us functions on ⊕-types via case distinction, each case of which we can think of as a lambda expression. Hence it makes sense to add the following abbreviation:
Definition 6.1:
For 𝑓 ∶ ∏𝑥∶𝐴 𝐶(𝑥 ↪ℓ 𝐵) and 𝑔 ∶ ∏𝑥∶𝐵 𝐶(𝐴 𝑟↩ 𝑥), let:
  𝑓 ⊕ 𝑔 ∶= 𝜆𝑦∶𝐴⊕𝐵. match 𝑦 {𝑥 ⇒𝐶(𝑥) 𝑓(𝑥) | 𝑔(𝑥)} ∶ ∏𝑥∶𝐴⊕𝐵 𝐶(𝑥)
(In the independent case 𝑓 ∶ 𝐴 → 𝐶 and 𝑔 ∶ 𝐵 → 𝐶, we get 𝑓 ⊕ 𝑔 ∶ 𝐴 ⊕ 𝐵 → 𝐶.)
The implementation consists of a single computation rule for terms 𝑓 ⊕ 𝑔.
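As an intuition for function addition, the construction can be pictured in plain Scala, using Either as a stand-in for ⊕ (this is not the Mmt implementation, and it only covers the independent case 𝑓 ∶ 𝐴 → 𝐶, 𝑔 ∶ 𝐵 → 𝐶):

```scala
// Coproducts A ⊕ B modeled by Scala's Either; injections are Left/Right
def injL[A, B](a: A): Either[A, B] = Left(a)
def injR[A, B](b: B): Either[A, B] = Right(b)

// The elimination form: a case distinction on the two constituent types
def matchCoprod[A, B, C](y: Either[A, B])(f: A => C, g: B => C): C = y match {
  case Left(a)  => f(a)
  case Right(b) => g(b)
}

// Function addition f ⊕ g: the curried form of the case distinction
def funSum[A, B, C](f: A => C, g: B => C): Either[A, B] => C =
  y => matchCoprod(y)(f, g)

val show = funSum[Int, String, String](i => s"int: $i", str => s"str: $str")
```

Applied to a left injection, f ⊕ g behaves like f; applied to a right injection, like g – exactly the computation rule mentioned above.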
The basic rule system is given in Figure 6.2, but can be extended for a more desirable behavior in presence of subtyping.
Figure 6.2: Rules for Coproducts (formation, introduction, elimination, and computation rules, each for any 𝑈 with Γ ⊢ 𝑈 univ)
In a categorical setting, coproducts are dual to Cartesian products as covered in Chapter 5. As a result, the roles of introduction and elimination forms in our rule pattern are somewhat reversed. Consequently, there are a couple of things to note about our rules:
•
•
•
match
•
The crucial premise here is the second one, since to evaluate the match-expression we will need to infer the type of 𝑝. Hence this typing rule would in practice reduce to checking whether the inferred type of 𝑝 is a subtype of 𝐴 ⊕ 𝐵, something the solver checks by default anyway.
•
•
It is thus debatable whether “equal for any particular introduction form” should imply “equal everywhere”. If we answer this question positively, we imply axiom K [HS02], which is equivalent to term irrelevance for equality types and hence incompatible with e.g. HoTT (see Section 6.5) and similar constructive systems. Hence, if we want to add an equality rule to preserve extensionality, we should do so in a separate theory and exclude it when formalizing constructive logics and type theories.
6.1.1 Subtyping Behavior of Coproducts
We have previously used the type checking rules as a justification for derivable subtyping rules. Since a type checking rule for arbitrary elements of 𝐴 ⊕ 𝐵 does not exist (in a meaningful way), we can instead introduce a subtyping rule and use that to derive type checking rules for the introduction forms. The obvious guiding criterion to use is that any pattern match on 𝐴 ⊕ 𝐵 needs to be applicable to any 𝑥 ∶ 𝑆 for 𝑆 <∶ 𝐴 ⊕ 𝐵. This implies the following covariance rule:
from which we can derive the following typing rules:
This is especially useful in the presence of an empty type ∅ (see Section 6.2), where 𝑎 ↪ℓ ∅ now type checks against any 𝐴 ⊕ 𝐵 (analogously for 𝑟↩). However, we then also need to add an additional equality rule, since all left injections 𝑎 ↪ℓ 𝐵ᵢ should be equal (at some ⊕-type). Hence:
6.1.2 Implementation
Given the rules in Figure 6.2 and the examples in Chapter 5, an implementation of coproducts is rather straightforward. Listing 6.1 and Listing 6.2 show excerpts of the associated Mmt theory and a toy example for testing the implementation. Notably, both ⊕ and match are implemented with flexary notations; a helper object (similar to ∑-types, see Chapter 5) takes care of deconstructing expressions with multiple arguments into binary applications of the symbol.
Coproducts are not very useful in and of themselves; however, in conjunction with a unit type they are sufficient to construct finite types (see Section 6.2), and they are used to construct W-types (see Section 6.3).
Remark 6.2:
The notation of coproducts as an “additive” construct 𝐴 ⊕ 𝐵 is suggestive of a more general correspondence: Thinking of coproducts as sums, we can consider the result of iteratively taking the coproduct of a fixed type 𝐵 over some index type 𝐴, i.e. |𝐴| times. Naturally, an element of this type consists of some element 𝑏 ∶ 𝐵 and its index 𝑎 ∶ 𝐴; i.e. we can identify this type with the type 𝐴 × 𝐵. Analogously, taking an indexed coproduct over a type family 𝐵(𝑎), we can think of the type ⊕𝑎∶𝐴 𝐵(𝑎) as the dependent sigma type ∑𝑎∶𝐴 𝐵(𝑎). In other words: a product is an iterated sum, reminiscent of arithmetic.
Analogously, we can think of a dependent function type ∏𝑎∶𝐴 𝐵(𝑎) as the iterated product 𝐵(𝑎₁) × 𝐵(𝑎₂) × …, an element of which consists of exactly one element 𝑏 ∶ 𝐵(𝑎) for each 𝑎 ∶ 𝐴, which we can naturally interpret as a set of pairs, i.e. a function.
Note that this is merely an intuitive correspondence, neither relevant nor helpful from an implementation perspective.
6.2 Finite Types
Finite types are types with a fixed, finite number of elements. Their predetermined number of elements can be exploited using case distinction. Their usefulness is immediately obvious when trying to formalize e.g. finite groups and functions between them, but they are also ubiquitous in e.g. object-oriented programming languages. Naturally, it is sufficient to have, for each natural number 𝑛, one type with 𝑛 elements, since all finite types with the same number of elements are isomorphic. Therefore we will focus on this approach, since it conveniently introduces names for the members of finite types and spares us the trouble of having to label them.
Starting with the empty type ∅, we face the question of how to formally declare that this type has no elements. Since we can always declare a new constant null ∶ ∅, we need to make sure that the existence of this constant leads to a contradiction. We can do so by allowing the construction of a term of any type given an element of ∅, which (using Curry–Howard or judgments-as-types) corresponds to the principle ex falso quodlibet. There are three ways to go about this:
1.
2.
3.
The obvious advantage of the second approach is that it allows using ↪𝐴 ⋅ like an actual typed function. The disadvantage is that this approach requires function types to be present, hence reducing modularity, and the advantage is negligible. The third approach has the advantage that it obviates the need for an elimination form ↪⋅ ⋅ completely – a postulated element 𝑒 ∶ ∅ already gives us an element of every other type, namely itself.
Given an additional singleton type Unit with a single element ⋆, we can use the coproducts implemented in Section 6.1 to construct the remaining finite types. This way, we can use the elimination form for coproducts to pattern match on an element of a finite type, obviating the need for a dedicated elimination form. There are two ways to do this:
1. Define 1 ∶= Unit, 2 ∶= 1 ⊕ 1, 3 ∶= 1 ⊕ 2, etc.
2. Define 0 ∶= ∅, 1 ∶= Unit ⊕ 0, 2 ∶= Unit ⊕ 1, 3 ∶= Unit ⊕ 2, etc.
[Table: the elements 𝑖 of each finite type 𝑗, built from ⋆ ∶ Unit via injections]
and 𝑖 is the same expression at all numerical types. In particular, we can easily define the successor function as succ(𝑛) ∶= Unit 𝑟↩ 𝑛.
Conveniently, defining ∅ to be a subtype of any other type allows for implementing both variants.
In the presence of function types, we can also define Unit as ∅ → ∅, knowing that the only possible inhabitant of this type is the function ⋆ = 𝜆𝑥∶∅. 𝑥 = 𝜆𝑥∶∅. ↪∅ 𝑥. However, even using this definition, we still have to provide a rule for the solver to be able to exploit this knowledge (and to make sure that ↪∅ 𝑥 is the identity).
All of this leaves us with several options listed in Figure 6.3 and the basic rules in Figure 6.4.
Figure 6.3: Possible Implementations for Finite Types (for each combination of available features, whether the finite types, higher finite types, and the elimination form are primitive, definable using match, or given by a dedicated case construct)
Of course, constructing enumeration types and their elements by nesting ⊕-types and injections is simple to formally specify and implement, but inconvenient to use in practice. Furthermore, while it is possible to build up enumeration types using ⊕ even in the absence of coproducts by adapting their rules, the most general and user-friendly way to implement finite types is via a primitive type constructor on actual integers.
We can do so using a type Nat of number literals, which we discuss further in Section 6.2.1: we implement a type constructor enum(𝑖) for the finite type with 𝑖 elements, and analogously a constructor case_𝑗(𝑖) for its elements. In this case, it is desirable that enum(𝑥) is a valid type even if 𝑥 is a variable or otherwise undetermined, and that case_𝑗(𝑥) checks against type enum(𝑗) iff 𝑥 < 𝑗. If 𝑥 and 𝑗 are not definite number literals, this implies proving a judgment Γ ⊢ 𝑥 < 𝑗, which requires corresponding proof rules for the ordering on natural numbers and a way for the user (and the solver) to construct (and find) such proofs.
If we want to consider enumeration types as subtypes of each other, the argument 𝑗 in case_𝑗(𝑖) becomes unnecessary, so we can write case(𝑖) instead.
Remark 6.3:
In service of clarity, I will not introduce object-level judgments here, since they require corresponding proof rules which are not really relevant for this chapter. Suffice it to say that Mmt offers a method Solver.prove(tm : Term) that returns true if and only if the Solver manages to find some element p of type tm. This method can cover premises of the form Γ ⊢ 𝑥 < 𝑦 for any choice of object-level judgments (and their proofs). Of course, this immediately makes type checking undecidable, and the power of the rule system ultimately depends on the power of the internal prover, which in the case of Mmt is rather weak.
Figure 6.5 gives a set of rules for these enum-types, where enum(𝑛) represents the (or, in the case of subtyping, any variant of the) defined finite type corresponding to the (definite) number literal 𝑛, and 𝐧 represents an element of such a type (again, for a definite number 𝑛). Which of the optional rules to implement depends on which of the implementation approaches discussed above we choose.
To summarize: Finite types give us various ways of implementing their different aspects, depending on which additional typing features are available in any context and how we want them to behave with respect to subtyping. Note however that implementing all constructors as primitives still allows us to reuse the symbols already implemented for other typing features: a primitive elimination form for finite types can reuse the match-symbol from the Symbols theory for coproducts, and we can reuse the ⊕-symbol to construct higher finite types even in the absence of the rules governing coproducts. Even the rules governing these symbols can be reused with minimal adaptation (to restrict the types that are valid arguments to finite types).
This allows for developing library content with minimal dependencies, allowing for adding new features (e.g. coproducts or function types) as needed without reimplementing content that was developed without these features. The additional features only provide new definitions for the already implemented formalizations without changing their semantics significantly. Consequently, we will omit the rules for the constructs already covered in the previous sections in the basic rules listed in Figure 6.4.
6.2.1 Generic Literals in Mmt
To implement enum and case, we need to be able to explicitly use numbers and have them be syntactically valid and semantically meaningful Mmt terms – i.e. we want them to be literals. The Mmt API offers a class RealizedType for generic literals, which combines a syntactic type (an inhabitable Mmt term) with a SemanticType. A SemanticType represents an actual Scala class holding the values of our literals, as well as a lexer used to parse them in surface syntax.
Additionally, the Mmt API already implements SemanticTypes for the most prevalent kinds of literals and their corresponding Scala implementations, namely BigInt (unrestricted integers), String, Double (floating point numbers), Boolean etc. In our specific case, we can use StandardNat extends SemanticType, which uses BigInt values restricted to non-negative numbers.
Mmt’s urtheories library [OMU] also already has a theory NatSymbols with a type NAT that we want to use as a syntactic type. To couple the two, we import a rule in an Mmt theory, which we either implement in Scala (using the class RealizedType), or even more conveniently by importing the parametric rule Realize with the name of the SemanticType and the syntactic type as parameters:
Hence, by importing the above theory, we can use integer literals, which will be assigned the principal type NatSymbols?NAT.
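The coupling of a syntactic and a semantic type can be pictured with a self-contained sketch. The names SemanticType, Realized, and NatSemantics below are simplified stand-ins invented for illustration, not the actual RealizedType machinery:

```scala
// Sketch: a semantic type holds actual Scala values plus a lexer for them
trait SemanticType[V] {
  def fromString(s: String): Option[V]  // lex a literal from surface syntax
  def valid(v: V): Boolean              // restrict the value space
}

// Natural numbers: unrestricted BigInt values restricted to >= 0
object NatSemantics extends SemanticType[BigInt] {
  def fromString(s: String): Option[BigInt] =
    if (s.nonEmpty && s.forall(_.isDigit)) Some(BigInt(s)) else None
  def valid(v: BigInt): Boolean = v >= 0
}

// A realized type couples a syntactic type (here just its name) with the
// semantics; parsing a literal yields its value and its principal type
case class Realized[V](syntacticType: String, semantics: SemanticType[V]) {
  def parse(s: String): Option[(V, String)] =
    semantics.fromString(s).filter(semantics.valid).map(v => (v, syntacticType))
}

val nat = Realized("NatSymbols?NAT", NatSemantics)
```

A well-formed numeral is lexed into a BigInt and assigned the syntactic type, while anything else is rejected at the lexing stage.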
6.2.2 Implementation and Irrelevance Rules
One noteworthy aspect of the rules in Figure 6.4 is that the equality rule tells us that all terms of a specific type are equal – in other words, the type Unit has (at most, i.e. in this case exactly) one element. While mathematically both of these statements are logically equivalent, they are intrinsically different from the point of view of an implementation. In an implementation, to leverage the first statement we need two syntactically distinct terms that can then be shown to be equal. Hence this rule will only ever become relevant in the presence of two distinct terms of the same type Unit. The second statement however tells us that whenever we need any term of type Unit, e.g. to solve a variable of type Unit, there is exactly one element that we can use.
There is a dedicated rule class for these situations, namely TypeBasedSolutionRules. These not only extend TypeBasedEqualityRules, but also tell the system that it is irrelevant which term of the given type is used, and more specifically, how a variable of this type should be solved. This is particularly relevant in the presence of judgments-as-types and proof arguments, where usually the specific term (i.e. proof) of a given judgment type 𝐽 is irrelevant, as long as there is any such term (this is called proof irrelevance), in which case the solution rule can activate the prover.
Example 6.1: The solution rule for the Unit type is given in Listing 6.3.
• Line 2 makes this rule applicable also to Pi-terms. This is to guarantee compatibility when defining Unit as ∅ → ∅.
• Line 5 checks that tp is the Unit type.
Listing 6.3: The solution rule for Unit
Listing 6.4 shows the Mmt theories associated with all the above options for finite types. Note their modular development, offering theories that mix in basic LF, or coproducts, or both, with or without subtyping. Figure 6.6 shows the corresponding development graph with the symbol theories at the bottom.
Notably, finite types allow us to implement finite functions in an exploitably computational manner. As an example, Listing 6.5 shows a formalization of the finite group Z/2Z and its group operation in such a manner that the solver manages to compute 0 ○ 0 = 0 from the definition of the operation. While the definition of op is syntactically ugly, one could imagine using a structural feature (see Section 6.3) specifically to introduce finite functions.
6.3 W-Types
Ignoring the category theory involved, how to represent inductive types as W-types is best explained by example:
Example 6.2:
• The natural numbers have two constructors: zero with no arguments, and successor with one recursive argument. Their W-type would thus be N ∶= W𝑥∶2. match 𝑥 {𝑦 ⇒ type | 0 | 1}.
• Lists over a type 𝐴 have a constructor nil with no arguments, and for each 𝑎 ∶ 𝐴 a constructor cons 𝑎 with one recursive argument. Their W-type would thus be List 𝐴 ∶= W𝑥∶Unit ⊕ 𝐴. match 𝑥 {𝑦 ⇒ type | 0 | 1}.
• Binary trees with labels in 𝐴 have, for each 𝑎 ∶ 𝐴, a leaf constructor with no arguments and a node constructor with two recursive arguments. Their W-type would thus be Tree 𝐴 ∶= W𝑥∶𝐴 ⊕ 𝐴. match 𝑥 {𝑦 ⇒ type | 0 | 2}.
As the examples show, constructors that depend on an element of another type 𝐴 are considered separate constructors for each 𝑎 ∶ 𝐴.
Remark 6.4:
Since the number and arities of constructors in a W-type are almost always finite but need encoding as types, W-types (at least as described here) are almost useless without finite types. Moreover, dependent inductive types such as lists over a type 𝐴 additionally require coproducts. However, note that our W-types do not strictly require coproducts or finite types a priori – they are merely a lot less useful without these features.
As introduction form for a W-type 𝑊 ∶= W𝑥∶𝐴. 𝐵(𝑥), we have a supremum-operator sup 𝑐 {𝑥 ⇒ 𝑎(𝑥)}, where 𝑐 ∶ 𝐴 tells us the constructor case, 𝑥 ∶ 𝐵(𝑐) is a variable that represents the inductive parameters needed, and 𝑎(𝑥) ∶ 𝑊 represents the arguments for the constructor. How this works is again best understood by example:
Example 6.3:
• (the sup-expressions for zero and successor of N)
• (the sup-expressions for nil and cons of List 𝐴)
• (the sup-expressions for leaf and node of Tree 𝐴)
We can think of the argument 𝑎(𝑥) in sup 𝑐 {𝑥 ⇒ 𝑎(𝑥)} for a type 𝑊 ∶= W𝑥∶𝐴. 𝐵(𝑥) as a function that assigns each parameter position of type 𝑊 of a constructor (such as the two subtrees in the last example) to the corresponding argument. In the constant cases, 𝑥 has type ∅ and we need to construct an element of 𝑊 from it, hence in these cases the argument is always ↪𝑊 𝑥.
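The shape of sup can be pictured in plain Scala (illustrative only): a W-type value is a constructor label together with a function from arity positions to recursive arguments. Scala cannot express that the arity type 𝐵 depends on the label, so this sketch fixes one child-index type per instance; for the naturals the label is a Boolean (zero/succ) and succ has one child indexed by Unit.

```scala
// A W-type over labels A, with recursive arguments indexed by B:
// sup(c, a) packages a constructor label c with its children function a
case class W[A, B](label: A, children: B => W[A, B])

// Natural numbers: sup(succ, n) has one child; zero never uses its children
type Nat = W[Boolean, Unit]
val zero: Nat = W(false, _ => sys.error("zero has no children"))
def succ(n: Nat): Nat = W(true, _ => n)

// Reading a numeral back off the tree by walking the children
def toInt(n: Nat): Int = if (n.label) 1 + toInt(n.children(())) else 0

val two = succ(succ(zero))
```

In the actual encoding the zero case has child-index type ∅, so no children can ever be requested; the error thrown here stands in for that impossibility.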
Elimination of W-types is cumbersome and technical, so it is worth going into some detail. To eliminate an element of a type 𝑊 ∶= W𝑐∶𝐴. 𝐵(𝑐), we want to define a function 𝑓 out of 𝑊 with target type 𝐶(𝑤) (for 𝑤 ∈ 𝑊) via well-founded recursion. For every constructor case 𝑐 ∶ 𝐴, we have an arity of 𝐵(𝑐), so we would have to define 𝑓 on 𝑐 and the 𝐵(𝑐)-many (recursive) arguments of type 𝑊. The latter can be encoded (or thought of) as a function 𝑔_𝑐 ∶ 𝐵(𝑐) → 𝑊. Additionally, we need to be able to use the recursive values of the very function 𝑓 we are defining on the recursive arguments encoded in 𝑔_𝑐. These we can similarly think of as encoded in a function ℎ_{𝑐,𝑔_𝑐} ∶ ∏𝑏∶𝐵(𝑐) 𝐶(𝑔_𝑐(𝑏)).
Hence, to be able to define 𝑓, we need an element 𝑐 ∶ 𝐴, a way to obtain 𝑔_𝑐(𝑦) ∶ 𝑊 for any 𝑦 ∶ 𝐵(𝑐) for the recursive arguments, and a way to obtain ℎ(𝑏) ∶ 𝐶(𝑔_𝑐(𝑏)) for any 𝑏 ∶ 𝐵(𝑐). The easiest way to do so is to have the elimination constructor bind variables for these, some of them representing (dependent) functions. This requires function types, of course. In the absence of function types, the rules governing them can be added without making the associated symbols (lambda-abstraction, function type constructors etc.) available to the user. Alternatively, one could implement dedicated syntactic constructs that mirror the application of functions in the particular context of recursive definitions on W-types, which is even more cumbersome, but theoretically straightforward. For simplicity's sake, I will hence assume function types to be present in the remainder of this section.
Consequently, we introduce an elimination operator rec(𝑤) {(𝑐,𝑔,ℎ) ⇒𝐶(𝑤) 𝑒(𝑐,𝑔,ℎ)}, with the defining equation:
  rec(sup 𝑐 {𝑥 ⇒ 𝑎(𝑥)}) {(𝑐,𝑔,ℎ) ⇒𝐶(𝑤) 𝑒(𝑐,𝑔,ℎ)} = 𝑒(𝑐, 𝜆𝑥. 𝑎(𝑥), 𝜆𝑥. rec(𝑎(𝑥)) {(𝑐,𝑔,ℎ) ⇒𝐶(𝑤) 𝑒(𝑐,𝑔,ℎ)})
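The recursion principle just described can be made concrete in the same simplified, non-dependent Scala sketch as before (again not the Mmt implementation): rec hands the body 𝑒 the constructor label 𝑐, the recursive arguments 𝑔, and the recursive results ℎ = rec ∘ 𝑔.

```scala
// W-type values: a constructor label plus a children function (see above)
case class W[A, B](label: A, children: B => W[A, B])

// Well-founded recursion on W: e receives the constructor label c,
// the recursive arguments g, and the recursive results h = rec ∘ g
def rec[A, B, C](w: W[A, B])(e: (A, B => W[A, B], B => C) => C): C =
  e(w.label, w.children, b => rec(w.children(b))(e))

// Natural numbers: label true = successor (one child), false = zero
type Nat = W[Boolean, Unit]
val zero: Nat = W(false, _ => sys.error("zero has no children"))
def succ(n: Nat): Nat = W(true, _ => n)

// Addition by recursion on the first argument:
// plus(zero, m) = m; plus(succ n, m) = succ(plus(n, m))
def plus(n: Nat, m: Nat): Nat =
  rec[Boolean, Unit, Nat](n)((c, _, h) => if (c) succ(h(())) else m)

def toInt(n: Nat): Int =
  rec[Boolean, Unit, Int](n)((c, _, h) => if (c) 1 + h(()) else 0)

val four = plus(succ(succ(zero)), succ(succ(zero)))
```

Unfolding rec on a sup-term reproduces exactly the defining equation above: the body is applied to the label, the children, and the recursively computed results.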
Example 6.4: For the natural numbers N as defined in Example 6.2, we can define addition via rec; using the defining equation for rec, we can then compute the sum of two numerals step by step (and analogously for List 𝐴 and Tree 𝐴).
The resulting grammar is given in Figure 6.7, the rules are given in Figure 6.8. Again there are several things to note:
•
•
•
Remark 6.6: Equality and Subtyping
There are several subtyping behaviors imaginable for W-types:
• The intended behavior could be recovered by having W-types carry an additional label for each constructor.
•
•
• This behavior can again be achieved by having W-types carry labels for the constructors, in which case the ambiguity can be resolved by demanding that the labels match for intended subtypes.
6.3.1 Implementation
From an implementation perspective, W-types are conveniently simple and fall neatly into our pattern of rules. Consequently, implementing them in Mmt is straightforward; the corresponding Mmt theory is presented in Listing 6.6.
Listing 6.6: Theories for W-types
From a user perspective however, W-types are incredibly inconvenient and unintuitive, counter to our goal of being as close to mathematical practice as possible. Listing 6.7 shows as an example the natural numbers with inductively defined addition on them.
Listing 6.7: Natural Numbers on W-types
6.3.2 Structural Features
We can remedy this inconvenience by providing a structural feature that offers a more convenient and intuitive syntax and elaborates into W-types, giving users a way to specify inductive types without having to use the cumbersome syntax offered by W-types – or having to know about them at all. The goal is to formalize natural numbers more akin to this:
A structural feature is an extension of the class StructuralFeature. Firstly, it instructs the parser how to parse the header of an instance of this feature (e.g. induct [name] =). For this, the structural feature has to provide a notation. The header is followed by an optional body of a theory (e.g. containing the declarations Nat, Zero and Succ). The parser wraps any application of the header notation into an object of class DerivedDeclaration, holding the components provided in the header and the optional body.
Secondly, after parsing, this DerivedDeclaration is passed on to the StructuralFeature, which returns an Elaboration, holding a list of Mmt declarations computed from the DerivedDeclaration. The details on implementing derived declarations and structural features are sketched in [Ian17] and thus omitted here.
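As a rough analogy for this pipeline (entirely simplified; Decl, DerivedDeclaration, and the Induct object below are invented stand-ins, not the actual API), a structural feature is essentially a function from a parsed derived declaration to the list of ordinary declarations it elaborates into:

```scala
// Simplified stand-ins: a derived declaration holds a feature keyword,
// a name, and a body of (name, type) declarations
case class Decl(name: String, tp: String)
case class DerivedDeclaration(feature: String, name: String, body: List[Decl])

trait StructuralFeature {
  val keyword: String
  def elaborate(dd: DerivedDeclaration): List[Decl]  // generated declarations
}

// A toy "induct" feature: emits the type itself plus its constructors.
// The real feature would instead compute a W-type and typed constructors.
object Induct extends StructuralFeature {
  val keyword = "induct"
  def elaborate(dd: DerivedDeclaration): List[Decl] =
    Decl(dd.name, "type") :: dd.body
}

val nat = DerivedDeclaration("induct", "Nat",
  List(Decl("Zero", "Nat"), Decl("Succ", "Nat -> Nat")))
val elab = Induct.elaborate(nat)
```

The essential design point survives the simplification: the user writes the convenient header-plus-body syntax, and only the elaboration decides what it means.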
For inductive types, we want two structural features: one that elaborates into a W-type and its constructors, and one for the elimination form. The keywords for these features are set as induct and def respectively, which allows us to implement natural numbers and addition in surface syntax as in Listing 6.8.
Listing 6.8: W-Types with Structural Features
All declarations in an induct- or def-environment are bundled into the body of a theory, which is itself wrapped into a DerivedDeclaration 𝑑. This declaration is then passed on to the StructuralFeature's elaborate-method, which in the case of induct computes from them the appropriate W-type (e.g. for N) and the constructors (e.g. Zero and Succ), and in the case of def the corresponding function defined by a rec-expression. We can pass additional parameters to the structural feature in the header of an induct- or def-environment, to allow for e.g. polymorphic W-types or inductive functions with arity > 1 (as in addition), as shown with lists and binary trees in Listing 6.9.
Listing 6.9: Lists and Trees
Remark 6.7: Structural Features for Induction
While we use W-types here as a target for elaborating our structural features, we are in no way required to do so. Colin Rothgang recently developed a similar feature (as well as many other structural features, e.g. for equivalence relations and quotients), which elaborates into undefined constants for the inductive principles instead. While this has the disadvantage that the solver can not natively exploit the generated induction principles automatically, it has the advantage that the resulting elaboration works with plain LF and can represent more advanced induction principles, such as mutually recursive types, which can not be represented using W-types alone.
Again, we are able to marry the two approaches quite easily while reusing the same structural feature and hence syntactic representation. For example, the feature rule can check for the presence of the W-type rules in the current context and decide accordingly whether to elaborate into W-types or primitive LF constants.
6.4 Cumulative Universe Hierarchies
Plain LF offers two universes type and kind. With respect to their ontological status, they can be thought of as analogous to the distinction between sets and proper classes. However, the two universes are often somewhat restrictive: PLF is already an extension of this two-tiered hierarchy, but even the shallow polymorphism it enables is often not enough, e.g. when we want to quantify over all groups (each of which has a base type) or over all sets (given some correspondence between sets and types), or when formalizing categories while trying to avoid a deep embedding.
In those situations, it can be more convenient to have infinitely many universes 𝒰ᵢ for each i ∈ ℕ. By declaring 𝒰ᵢ ⇐ 𝒰ⱼ and 𝒰ᵢ <: 𝒰ⱼ whenever i < j, we get an infinite cumulative hierarchy of universes, analogous (in a set-theoretic model) to higher levels of the von Neumann hierarchy indexed by large cardinals.
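Spelled out as inference rules, the declarations just described amount to the following sketch (a paraphrase, not verbatim from Figure 6.9):

```latex
\[
\frac{\Gamma \vdash i < j}{\Gamma \vdash \mathcal{U}_i \Leftarrow \mathcal{U}_j}
\qquad
\frac{\Gamma \vdash i < j}{\Gamma \vdash \mathcal{U}_i <: \mathcal{U}_j}
\]
```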
This is the approach taken by homotopy type theory [Uni13], Mizar [Miz] via Grothendieck universes, and most notably Coq [Tea03], where the precise universe in which a declaration lives is unspecified by the user and computed by the system by solving appropriate constraints on the full current context (floating universes).
The basic rules are straightforward and given in Figure 6.9. As with enumeration types (Section 6.2), we use ℕ number literals and postulate some (potentially judgments-as-types based) judgment i < j.
To be “backwards compatible” with everything implemented in LF alone, we can redefine type := 𝒰₀ and kind := 𝒰₁. Even though type <: kind does not hold in LF, this additional subtyping judgment does not impact plain LF content.
However, in order for ∏-types to be usable for types living in higher universes, we need to slightly modify the formation rule for ∏:
Note that for U = type, this subsumes the original rule. Since we can declare our new formation rule to shadow the old one, this does not break modularity. In addition, we augment Definition 4.1 by new cases:
Definition 6.2:
Remark 6.8: Polymorphism
One interesting aspect to consider is which universe ℕ should live in. In Section 6.2.1, we determined ℕ to be a type, hence ℕ ⇐ 𝒰₀. Now recall our formation rule for ∏, and consider the type T = ∏n:ℕ. ∏U:𝒰ₙ. B. Is this type valid?
The answer is no: the universe that T lives in would be max{𝒰₀, 𝒰ₙ}, which can not be computed since 𝒰ₙ depends on a free variable. Hence our rules forbid quantification over all universes.
However, similarly as in PLF (see Remark 4.6), we might want to allow shallow polymorphism, i.e. the ability to quantify over all universes on the outside of a ∏-type only. By declaring such a type to be inhabitable but untyped, we avoid paradoxes resulting from unrestricted quantification, but still allow for implementing polymorphic functions.
6.4.1 Implementation and Universe Rules
All the rules are again straightforward, but here we encounter a universe rule for the first time. As with the other rules, we need to implement an apply method for them, the signature of which is rather simple:
def apply(solver: Solver)(tm: Term)(implicit stack: Stack, history: History): Boolean
It returns true if the term tm is declared to be a universe by this rule.
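As a self-contained illustration (toy types, not the actual Mmt Solver/Term classes), a universe rule essentially pattern-matches on the term:

```scala
// Toy stand-ins for MMT terms (hypothetical, for illustration only)
sealed trait Term
case class Univ(level: Term) extends Term   // 𝒰 applied to a level
case class NatLit(n: Int)    extends Term   // a natural number literal
case class Other(s: String)  extends Term

// the apply method of a universe rule: true iff `tm` is declared to be a
// universe by this rule, i.e. 𝒰 applied to a natural number literal
def isUniverse(tm: Term): Boolean = tm match {
  case Univ(NatLit(n)) => n >= 0
  case _               => false
}
```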
Example 6.5:
The universe rule for 𝒰 is given in Listing 6.10. Line 2 makes this rule applicable, and Line 5 merely checks that the argument passed on to 𝒰 is a natural number literal.
Listing 6.10: The Universe Rule for 𝒰
def apply(solver: Solver)(tm: Term)(implicit stack: Stack, history: History): Boolean = tm match {
The corresponding theories for the symbols and rules are given in Listing 6.11. Notably, the theory TypedHierarchy, which is to serve as a “standalone” theory of a cumulative type hierarchy without any typing features, needs to import several fundamental Mmt theories that validate notions like typing, notations, module expressions etc. in the first place, as well as the theories containing the symbols for type and kind for compatibility with plain LF theories. Listing 6.12 shows a small example theory using a cumulative hierarchy. Additionally, universes are used all over the Math-in-the-Middle archive, notably in the algebraic theories and in categories.
Listing 6.11: Symbols and Rules
Listing 6.12: A Simple Theory for Universes
6.5 A Logical Framework Based on Homotopy Type Theory
Homotopy Type Theory [Uni13] (HoTT) is a foundation of mathematics built upon a type system that we can now define as LF + Cumulative Universes + ∑-Types + ⊕-Types + Finite Types + W-Types. It is based entirely around propositions-as-types, as summarized in Figure 6.10; more precisely, the “propositional” part of homotopy type theory is an intuitionistic higher-order logic: our proof rules can easily be observed to correspond to the usual introduction and elimination rules of a natural deduction calculus for intuitionistic logic. To get to a classical setting, it suffices to add a polymorphic constant tnd : ∏i:ℕ. ∏U:𝒰ᵢ. ∏A:U. A ⊕ (A → ∅), which by propositions-as-types corresponds to the statement A ∨ ¬A for all “propositions” A.
Formalizing HoTT in a logical framework such as LF is prohibitively difficult; a previous formalization by Florian Rabe is available online and shows the limitations of such an attempt: even ∑-types can only be fully formalized in LF by adding simplification rules (see Remark 5.2), and W-types are left out entirely. Furthermore, since HoTT is, like LF, based on a Martin-Löf type theory, a formalization within LF requires re-implementing that very type theory using a HOAS deep embedding, with all the associated drawbacks for working with and within the resulting formalization. This is contrasted by HoTT’s aim to be a foundation of mathematics alongside and orthogonal to other such frameworks. Consequently, it is much more adequate to lift an implementation of Homotopy Type Theory to the level of logical frameworks itself. Additionally, this makes HoTT an attractive case study for a modular logical framework.
Homotopy Type Theory is a framework under active investigation and development, and hence frequently subject to change and subtle shifts regarding formal details. The description here follows the presentation in [Uni13], which does not necessarily reflect the current state of the art.
Accordingly, the following definitions and theorems are all taken from the same source and described in much more detail there. They are only contained here for clarification. In particular, we omit all proofs.
6.5.1 Identity Types
Notably, in the correspondences in Figure 6.10 we are still missing (propositional) equalities. For these, we can add a simple equality type a ≐_A b : U whose semantics is provided by the intended propositions-as-types correspondence.
However, HoTT is somewhat peculiar here, since it does not consider propositional types to be proof irrelevant: equality of elements a ≐_A b is considered a type with arbitrarily many distinct elements, and this notion of equality is distinct from the judgmental equality a ≡ b : A. In particular, the existence of a “proof” p : a ≐_A b does not imply a ≡ b : A. The reverse implication holds trivially by the introduction form, for which we use reflexivity: for every a : A, we have refl_a : a ≐_A a.
For elimination we use a mechanism called (in the context of HoTT) path induction: given any p : a ≐_A b and c(x) : C(x), we have ind p {x, y, q ⟹ C(x, y, q)} c(x) : C(a, b, p), where for any x : A we demand that c(x) : C(x, x, refl_x), and we declare that ind refl_a {x, y, q ⟹ C(x, y, q)} c(x) = c(a). This requires some deliberation:
The central idea is that given a proposition (or type) C(x) depending on x : A, if we have a proof/element c(a) : C(a) and a proof p : a ≐_A b, then we should also have a proof/element c(b) : C(b) (congruence of equality). This is what ind yields in the simple case where C only depends on x.
In homotopy type theory, this principle is generalized in that the type C may depend not just on x, but also on an element y considered propositionally (but not necessarily judgmentally) equal to x, as well as on the proof p : x ≐_A y for that equality. The postulated judgmental equality ind refl_a {x, y, q ⟹ C(x, y, q)} c(x) = c(a) can then be interpreted as the statement that the family of types x ≐_A y (for fixed A and variable x, y) is inductively defined by the elements of the form refl_a, in the sense that any type family C(x, y, p) for x, y : A and p : x ≐_A y is determined by the cases of the form C(x, x, refl_x).
Note that this does not imply that the types a ≐_A b (for fixed elements a, b : A) are themselves inductively defined via the elements refl_a; whenever a and b are not judgmentally equal, this type is not inhabited by a refl-term at all.
Remark 6.9: Based Path Induction
The above is often inconvenient in situations where we want to use congruence of equality, i.e. when we know P(a) holds for a fixed, definite element a : A, and given p : a ≐_A b want to infer that P(b) holds as well. The reason is that we need to be able to provide an element c(x) which for arbitrary x : A represents a witness for P(x), which we do not have in those instances.
The way around this is to instead use path induction on lambda abstractions, which is a common enough situation to warrant a separate name: Based Path Induction. This is definable using general path induction in the following way:
Let T_C = ∏x:A. (a ≐_A x) → 𝒰 and T_D = ∏z:A. (x ≐_A z) → 𝒰:
Example 6.6:
1. Given the argument p : a ≐_A b, we have ind p {x, y, q ⟹ C(x, y, q)} c(x) : C(a, b, p), so we need to choose C(x, y, q) = (y ≐_A x) for C(a, b, p) to be the type b ≐_A a. The term c(x) will have to check against C(x, x, refl_x), so we can choose c(x) := refl_x. Hence, we get ind p {x, y, q ⟹ y ≐_A x} refl_x : b ≐_A a, so our function is symm := λa:A. λb:A. λp:(a ≐_A b). ind p {x, y, q ⟹ y ≐_A x} refl_x
2.
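The symm function of Example 6.6 can be replayed in Lean 4 syntax, where Eq.rec plays the role of the path induction operator ind (a sketch, using Lean's built-in identity type rather than the ≐ of this chapter):

```lean
-- symmetry of propositional equality via the recursor for the identity type
def symm' {A : Type} (a b : A) (p : a = b) : b = a :=
  Eq.rec (motive := fun y _ => y = a) rfl p
```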
The rules for identity types are given in Figure 6.11. Note that if we wanted to identify propositional equality a ≐_A b and judgmental equality a ≡ b : A, we could add a corresponding equality rule.
6.5.2 Equivalences and Univalence
The principal idea behind homotopy type theory is to interpret propositional equality topologically: given two “points” a, b : A, the type a ≐_A b is interpreted as the type of all continuous paths between a and b. This makes sense of the multitude of possible “proofs” of a ≐_A b, since not all continuous paths between points are equal. Two such paths p, q : a ≐_A b can be equivalent though, in the sense that there can be a continuous transformation from p to q; i.e. there can be a path r : p ≐_{(a ≐_A b)} q, making r a homotopy.
Definition 6.3:
Let f, g : ∏x:A. B(x). A homotopy from f to g is a function of type f ∼ g := ∏x:A. f(x) ≐_{B(x)} g(x).
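Definition 6.3 can be written down directly in Lean 4 notation (a sketch, again with Lean's identity type in place of ≐):

```lean
-- a homotopy from f to g: a pointwise family of equality proofs
def Homotopy {A : Type} {B : A → Type} (f g : (x : A) → B x) : Type :=
  (x : A) → f x = g x
```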
Theorem 6.1:
Under propositions-as-types, homotopy is an equivalence relation, i.e. there are functions of the following types:
∏f:(∏x:A. B(x)). f ∼ f
∏f,g:(∏x:A. B(x)). (f ∼ g) → (g ∼ f)
∏f,g,h:(∏x:A. B(x)). (f ∼ g) → (g ∼ h) → (f ∼ h)
We can now use homotopies to define two types as equivalent in the category theoretical manner: A function is an equivalence (isomorphism) if it is invertible (up to homotopy), and two types are equivalent (isomorphic) if there is an equivalence between them:
Definition 6.4:
1. We call a function f : A → B an equivalence if it is invertible up to homotopy.
2. We call two types A and B equivalent, written A ≃ B, if there is an equivalence between them.
Theorem 6.2:
As with homotopy, equivalence of types is an equivalence relation.
The final step that Homotopy Type Theory takes is to declare function extensionality (on equivalences) and to postulate the univalence axiom, which declares equality of types itself an equivalence.
Definition 6.5:
1. We define a polymorphic function happly, mapping a proof of f ≐ g to a homotopy f ∼ g.
2. We define a polymorphic function idtoequiv, mapping a proof of A ≐ B to an equivalence between A and B.
Axiom 6.1: Function Extensionality
For any A, B, f, g, happly is an equivalence, i.e. we have a constant of type isequiv(happly(n, A, B, f, g)).
Axiom 6.2: Univalence
For any A, B, idtoequiv is an equivalence, i.e. we have a constant of type isequiv(idtoequiv(n, A, B)). In particular, it follows that (A ≐_{𝒰ₙ} B) ≃ (A ≃ B).
6.5.3 Implementation
Implementing the rules for identity types as in Figure 6.11 is straightforward. All concepts and axioms of Homotopy Type Theory beyond our previously covered typing features can conveniently be specified in Mmt syntax directly, as in Listing 6.13.
Listing 6.13: A Theory for HoTT
Remark 6.10: Higher Inductive Types
One noteworthy aspect of Homotopy Type Theory that is being actively researched is the notion of higher inductive types: types that are inductively defined via elements of equality types. The prototypical example is the unit circle:
Since in HoTT elements of an equality type can be thought of as (homotopical) continuous paths between two points, one can imagine defining a unit circle as a single point a : S¹ together with a non-trivial path from a to a, that is, an element p : a ≐_{S¹} a distinct from refl_a. Using higher inductive types, the unit circle (as a type) S¹ is thought of as inductively defined / freely generated from the two declarations a : S¹ and p : a ≐_{S¹} a.
Unfortunately, no general formal specification of higher inductive types exists yet. However, once such a specification exists, one can use a structural feature as in Remark 6.7 (or even the same feature) to generate higher inductive types. In the meantime, such a feature can easily serve as a tool to test and experiment with various approaches for formalizing higher inductive types.
Figure 6.4: Basic Rules for Finite Types
Figure 6.5: Convenience Rules for Finite Types
Theories for Coproducts
Figure 6.6: A Modular Development Graph for Finite Types
Figure 6.7: Grammar for Coproduct-Types
Figure 6.8: Rules for W-Types
Figure 6.9: Rules for Universes
Higher-Order Logic | Homotopy Type Theory
Constant symbols | constants
Function symbols | functions
Predicate symbols | functions
Propositions | types
Conjunctions | Simple Products
Implications | Simple Function Types
Disjunctions | Coproducts
Negations | Function Types
Universal Quantifiers | Dependent Function Types
Existential Quantifiers | Dependent ∑-Types
Proofs | Elements
Figure 6.10: Propositions as Types
Figure 6.11: Rules for Identity Types
Chapter 7
Subtyping
The previously discussed subtyping rules were almost exclusively concerned with the variance behavior of type constructors, i.e. rules of the form C(a) <: C(a′) if (depending on the variance) a <: a′ or a′ <: a. Naturally, in the absence of any additional subtyping rules, these variance rules ultimately reduce to equality rules, since the only way to prove a <: a′ (ignoring recursively calling variance rules) is to check a ≡ a′ : A. However, some systems allow for primitive subtyping principles of various forms, e.g. by declaring A <: B axiomatically for two primitive types A, B, or by introducing predicate subtypes. Systems with subtyping mechanisms include e.g. PVS [ORS92], IMPS [FGT93] and, in a certain sense, Coq [Coq15] and Matita [Asp+06a], as we will discuss later. Variance rules are accordingly important for the composite types to behave as intended in the presence of primitive subtyping principles.
Notably, subtyping in general breaks several often desirable properties of a type system, such as terms having a unique type and (potentially) decidability. Consequently, many systems decide not to support any subtyping. While this allows for a convenient implementation and formal analysis of a system, it often leads to awkward formalizations of content that could be implemented very naturally with subtypes. For example, the numerical types (natural numbers, integers, real numbers) should often behave as subtypes of each other. Similarly, types for algebraic structures yield a very natural subtyping hierarchy (e.g. all groups are monoids).
In this chapter, I will discuss various approaches to implementing subtyping principles in Mmt, how and why they are not fully implementable in the system, and speculate on how to improve Mmt to better support subtyping mechanisms.
7.1 Issues with Subtyping in Mmt
Mmt supports subtyping in the form of a binary subtyping judgment A <: B and rules returning these judgments as conclusions. Notably however, a binary subtyping judgment is insufficient to exploit all deducible subtyping judgments without additional treatment or huge increases in computational cost. A simple example is transitivity:
Example 7.1:
Assume we have ℕ, ℤ, ℝ : type and subtyping rules proving ℕ <: ℤ and ℤ <: ℝ. Consider a statement ∏n:ℕ. ∏r:ℝ. n + r, where “+” has type ℝ → ℝ → ℝ. In order for this to be well-typed, the system has to prove Γ, n:ℕ ⊢ n ⇐ ℝ, which in the absence of additional rules would default to checking ℕ <: ℝ. We know that this follows from ℤ being an intermediary type between ℕ and ℝ, but in an implementation, this implies finding a seemingly arbitrary type C satisfying two additional (binary) judgments, which is computationally expensive. Hence the solver can not algorithmically conclude the judgment from the above two subtyping rules alone.
This example alone is sufficient to demonstrate that merely checking binary subtyping judgments is not sufficient for covering subtyping mechanisms found in other systems.
One way this can be resolved is by keeping the full subtyping lattice for the current context in memory, and dynamically expanding it whenever the context is extended by additional types. Additionally, we can model subtyping premises as constraints (i.e. lower and upper bounds in the subtyping lattice), and introduce appropriate constraints when encountering unsolved type variables. These can then be solved in a manner similar to the method proposed in [KP93]. Notably, to be able to solve these constraints the full subtyping lattice still needs to be known. The feasibility, potential cost and benefits of this approach will need to be explored.
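The lattice-based idea can be sketched in a few lines of self-contained Scala (a toy model, not the Mmt solver): declared edges are closed under transitivity once, after which every derivable judgment like ℕ <: ℝ is a simple lookup.

```scala
// Compute the transitive closure of declared subtype edges; the returned
// function decides A <: B by (reflexive) membership in the closure.
def subtypes(declared: Set[(String, String)]): (String, String) => Boolean = {
  var closure = declared
  var changed = true
  while (changed) {
    // compose every pair of edges A <: B and B <: C into A <: C
    val composed = for { (a, b) <- closure; (b2, c) <- closure; if b == b2 } yield (a, c)
    val grown = closure ++ composed
    changed = grown.size != closure.size
    closure = grown
  }
  (a, b) => a == b || closure((a, b))
}
```

With subtypes(Set(("Nat", "Int"), ("Int", "Real"))), the judgment Nat <: Real holds without ever guessing the intermediate type.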
Additional problems arise whenever it becomes necessary to coerce a term to a refinement of its principal type:
Example 7.2:
Consider odd numbers n, m : ℕ, as well as r₁ : ℝ = n/2, r₂ : ℝ = m/2 and S : ℕ → ℕ.
We know then that S(r₁ + r₂) is a well-typed expression, since r₁ + r₂ can be coerced to have type ℕ; however, r₁ + r₂ will have the inferred type ℝ from the types of + and the arguments r₁, r₂. From the point of view of the algorithm, the information that r₁ + r₂ is in fact a natural number is lost.
This problem can in principle be solved by introducing an object-level typing predicate relating a term a and a type B, with a corresponding rule. By allowing arbitrary terms for a and B and quantification on the outside, this corresponds to term declarations [SS89]. Again, how to implement this efficiently and whether doing so is an effective solution remains to be determined.
Things get more difficult in the presence of predicate subtyping: ⟨x : A | P(x)⟩ is the subtype of A containing exactly the elements x : A for which the predicate P(x) holds. Correspondingly, checking a : ⟨x : A | P(x)⟩ entails proving the proposition P(a), for which a strong prover is (if not required) strongly desirable, and the typing system is immediately undecidable. Additionally, the set of all types for a given term t is now a (for all practical purposes) infinite lattice, induced by all propositions that hold for t and the implications between them.
However, for knowledge management purposes where fully checking the accuracy of content is not strictly necessary, even weak support for computationally difficult features is sufficient for most purposes, especially plain representation. Consequently, in this section we will discuss several subtyping mechanisms that, while not strongly supported by Mmt, are still desirable as purely syntactical features with weak inference support to cover as many situations as possible.
7.2 Declared Subtypes and Rule Generators
At its most primitive, a user might want to axiomatically declare two types A, B to be subtypes of each other. Naively, one might implement this by introducing a new constructor A <∗ B such that A <∗ B is inhabitable whenever A and B are, and adding a simple subtyping rule:
The disadvantage of this naive approach is the existence of a subtyping rule that a) needs to be applicable for arbitrary types A, B and is hence potentially called every time a subtyping judgment is checked, and b) defers to the prover to find an element p : A <∗ B, which is computationally expensive and fails in the majority of cases anyway (e.g. when A and B are actually equal). Correspondingly, even though adequate, the mere presence of this rule will noticeably slow down type checking in general.
To avoid this problem, we can instead implement a change listener that generates a new subtyping rule for every constant of type A <∗ B, which is only applicable to the specific judgment A <: B and always returns true if applicable. Additionally, this introduces the constraint on the user that they can only declare two types as subtypes by introducing a constant of the respective <∗-type, which prohibits subtype judgments as arguments in dependent types. This allows the typing system (barring other factors) to stay decidable. The resulting inference rules are simple and listed in Figure 7.1.
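The generated rules can be mimicked with a self-contained toy (hypothetical names; the real SubtypeJudgRule of Listing 7.1 works on Mmt terms and is registered via a RuleConstant):

```scala
import scala.collection.mutable.ListBuffer

// one generated rule per declared constant of type A <* B; it fires only
// on the exact judgment A <: B and then immediately succeeds
case class SubtypeJudgRule(sub: String, sup: String) {
  def applicable(tp1: String, tp2: String): Boolean = tp1 == sub && tp2 == sup
}

val rules = ListBuffer[SubtypeJudgRule]()

// called by the change listener whenever a constant of type a <* b is added
def onConstantAdded(a: String, b: String): Unit = rules += SubtypeJudgRule(a, b)

// checking A <: B never invokes a prover: it only consults generated rules
def checkSubtype(a: String, b: String): Boolean = rules.exists(_.applicable(a, b))
```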
A change listener extends the class ChangeListener and needs to implement the following methods:
Example 7.3:
The change listener generating subtyping rules is given in Listing 7.1. Line 1 declares a new class SubtypeJudgRule for our generated SubtypingRules for the judgment; Line 3 makes such a rule applicable for the above judgment only, in which case Lines 4/5 simply return true for that judgment. The actual ChangeListener starts in Line 8. Line 12: given a constant with path, Line 16 attempts to map the constant to a rule. Lines 31–33 make sure to remove the RuleConstant whenever the generating constant is removed. Lines 35–47 contain the core method of the change listener. It acts whenever a new constant is added (Line 36) whose type is of the form A <∗ B.
Listing 7.1: A Change Listener Generating Subtyping Rules
def apply(solver: Solver)(tp1: Term, tp2: Term)(implicit stack: Stack, history: History): Option[Boolean]
Remark 7.1:
A ChangeListener is an Extension that needs to be explicitly registered with the Mmt system. As such it can not be activated by a rule <path>-statement. Instead, the command extension <fully−qualified−classpath> needs to be called before handling any content which depends on the ChangeListener. This is because change listeners have access to Mmt’s central components (backend, library, server, other extensions) and hence can change the global state of the system, which mere content should not be allowed to do.
7.3 Intersection Types
In [Pfe93], the authors present an extension of LF by refinement types (sorts) and intersection types, which introduce a weak subtyping principle while retaining decidability. The following canonical example illustrates the usefulness of these intersection types:
Example 7.4:
Assume a type ℕ of natural numbers with two declared subtypes (refinement types) Even and Odd, representing even and odd natural numbers, respectively. Naturally, the successor function S has type ℕ → ℕ; however, we also know that it maps odd numbers to even numbers and vice versa. Hence S can also be implemented with the types Odd → Even and Even → Odd, both of which are refinements of the type ℕ → ℕ.
Intersection types allow us to encode this additional typing information by giving S all three types at once, or more precisely, the intersection type of all three: S : (Odd → Even) ⊓ (Even → Odd) ⊓ (ℕ → ℕ).
Remark 7.2: ∏-Elimination
As the above example suggests, intersection types are primarily useful for refining function types by encoding additional information on the return types based on the (refinement) types of the inputs. Note that this breaks the hitherto valid assumption that every function has a principal ∏-type. In fact, the basic rules for LF (to be precise: the elimination rule for function application) assume that the inferred type of every function is equal to a ∏-type.
[Pfe93] deals primarily with refinement types, which can be thought of as sorts of a specific (maximal) type. Two refinement types can only be compared if they refine the same maximal type. This allows for retaining decidability and keeping the original elimination rule for function applications (under the additional rule that (∏x:A. B) ⊓ (∏x:A. C) <: ∏x:A. (B ⊓ C)), but already excludes Example 7.4.
Thankfully, Mmt allows rules to shadow other rules; we can hence implement a new elimination rule for LF with intersection types which subsumes the primitive LF rule. In the presence of this new rule, the old rule is “deactivated”, thus preserving modularity.
Remark 7.3: Inhabitants of Intersection Types
Another problem with intersection types in a logical framework is to inhabit them with λ-terms. Consider the following example:
To check the definiens of succ2 against its type, the λ-bound variable x needs to be either explicitly typed, or its type must be inferred from the remainder of the definiens. In both cases we run into the problem that checking the λ-expression against the three distinct intersected function types requires typing x differently each time.
[Sto+17] attempts to solve this by introducing a dedicated introduction form ⟨a, b⟩ for intersection types A ⊓ B, where it is required that a ⇐ A, b ⇐ B, and that a and b have the same shape (roughly meaning: the same syntax tree, disregarding the types of bound variables). The idea is that the definiens for succ2 would have to be ⟨[x:Even] S S x, [x:Odd] S S x, [x:Nat] S S x⟩. While this solves the problem in theory, it is inconvenient for users in practice.
The basic rules for intersection types are given in Figure 7.2.
Remark 7.4: Lambda-Expressions in Intersection Types
We can attempt to simplify the introduction form in Remark 7.3 for users by supplying additional rules. This is necessarily computationally expensive, since every conjunct in an intersection needs to be checked individually for each application of such a function, and hence not an ideal solution. Consider the following expression, where E and O abbreviate Even and Odd, respectively. We can then supply the ⟨⋅⟩-operator with rules that either duplicate the lambda-expression with different types for the bound variables (as in Remark 7.3), or alternatively type the variables n, m with appropriate ⊕-types and automatically insert the injections ⋅↪ℓ or ᵣ↩⋅ in any application of a ⟨λ⟩-expression. Either case would retain the formal correctness of the approach in [Sto+17] without forcing users to provide multiple lambda expressions of the same shape, and would require merely a new inference and computation rule for LF function applications.
A rudimentary form of intersection types has been implemented in Mmt. Notably, intersection types as described above still have the drawback that a user has to provide all of the intersected types when declaring a new constant, without being able to refine them afterwards. For example, the successor function as in Remark 7.3 would require the types Odd and Even to be declared beforehand, which is counter to the usual order of declarations when implementing natural numbers.
Term declarations as mentioned above could potentially be used to add additional refinement types to a constant after its declaration.
7.4 Predicate Subtypes
The type ⟨x : A | P(x)⟩ corresponds to the subtype of A containing exactly those elements x : A for which the predicate P(x) holds. In the presence of Judgments-as-Types, this corresponds to the type of elements x : A for which the dependent type P(x) is inhabited; i.e. checking a term t against the type ⟨x : A | P(x)⟩ entails checking t : A and the existence of a witness p : P(t).
Given an element a : ⟨x : A | P(x)⟩, we should be able to obtain a witness p : P(a); hence we add an additional symbol predOf with predOf(a) : P(a).
The remaining rules can be easily derived from this intended semantics:
Figure 7.3: Rules for Predicate Subtypes
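For reference, the key rules described in the surrounding text can be sketched as follows (a paraphrase of Figure 7.3, under the judgments-as-types reading; the premises may differ in detail from the thesis):

```latex
\[
\frac{\Gamma \vdash t \Leftarrow A \qquad \Gamma \vdash p : P(t)}
     {\Gamma \vdash t \Leftarrow \langle x : A \mid P(x)\rangle}
\qquad
\frac{\Gamma \vdash a \Leftarrow \langle x : A \mid P(x)\rangle}
     {\Gamma \vdash \mathrm{predOf}(a) : P(a)}
\qquad
\Gamma \vdash \langle x : A \mid P(x)\rangle <: A
\]
```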
Remark 7.5:
Note that the second subtyping rule is immediately derivable from the type checking rule. Furthermore, either subtyping rule implies that ⟨x : A | P(x)⟩ <: ⟨y : A | Q(y)⟩ iff for any z : A the proposition P(z) implies Q(z), i.e. iff there is a function f : ∏z:A. P(z) → Q(z).
Remark 7.6:
As mentioned in Remark 5.1, predicate subtyping is on paper subsumed by ∑-types. The difference in an implementation is that ∑-types necessitate inserting coercions between the types A and ∑x:A. P(x): by pairing an element a : A with a witness p : P(a) in the one direction, and by the projection πℓ(⋅) in the other. If these coercions can be inserted automatically and hence left implicit, we can define predOf(a) := πᵣ(a), and all rules in Figure 7.3 become derivable for ∑-types. This approach is taken e.g. in Matita, and is enabled in Coq via the Program package, which is intended to mimic the subtyping behavior of PVS.
The rudimentary implementation of predicate subtypes in Mmt based on these rules is used e.g. for our formalization of the PVS logic, described in Chapter 10.
Chapter 8
Record Types and Models
Disclaimer:
The contents of this chapter have been previously published as [MRK18] and [MRK] with coauthors Florian Rabe and Michael Kohlhase. Regarding individual contributions, both the research and the original writing were done in close collaboration, hence a strict separation of contributions among the authors is impossible in retrospect. The writing has been revised for this thesis.
Additionally, the implementation and formulation of the rules for record types are my contribution, under supervision (and with help) by Florian Rabe.
In the area of formal systems like type theories, logics, and specification and programming languages, various language features have been studied that allow for inheritance and modularity, e.g., theories, classes, contexts, and records. They all share the motivation of grouping a list of declarations into a new entity, such as in R = ⟦x₁ : A₁, ..., xₙ : Aₙ⟧. The basic intuition behind it is that R behaves like a product type whose values are of the form ⟬x₁ : A₁ := a₁, ..., xₙ : Aₙ := aₙ⟭. Such constructs are indispensable already for elementary applications such as defining the algebraic structure of semilattices (as in Figure 8.1), which we will use as a running example.
Many systems support stratified grouping (where the language is divided into a lower level for the base language and a higher level that introduces the grouping constructs) or integrated grouping (where the grouping construct is one out of many type-forming operations without distinguished ontological status), or both. The names of the grouping constructs vary between systems, and we will call them theories and records in this chapter. An overview of some representative examples is given in Figure 8.2.
The two approaches have different advantages. Stratified grouping permits a separation of concerns between the core language and the module system. It also captures high-level structure well in a way that is easy to manage and discover in large libraries, closely related to the advantages of the little theories approach [FGT92a]. But integrated grouping allows applying base language operations (such as quantification or tactics) to the grouping constructs. For this reason, the (relatively simple) stratified Coq module system is disregarded in favor of records in major developments such as [Mat].
Allowing both features can lead to a duplication of work where the same hierarchy is formalized once using theories and once using records. A compromise solution is common in object-oriented programming languages, where classes behave very much like stratified grouping but are at the same time normal types of the type system. We call this internalizing the higher level features. While combining advantages of stratified and integrated grouping, internalizing is a very heavyweight type system feature: stratified grouping does not change the type system at all, and integrated grouping can be easily added to or removed from a type system, but internalization adds a very complex type system feature from the get-go. It has not been applied much to logics and similar formal systems: the only example we are aware of is the FoCaLiZe [Har+12] system. A much weaker form of internalization is used in OBJ and related systems based on stratified grouping: here theories may be used as (and only as) the types of parameters of parametric theories. Most similarly to our approach, OCaml’s first-class modules internalize the theory (called module type in OCaml) 𝑀 as the type module 𝑀; contrary to both OO-languages and our approach, this kind of internalization is in addition and unrelated to integrated grouping.
In any case, because theories usually allow for advanced declarations like imports, definitions, and notations, as well as extra-logical declarations, systematically internalizing theories requires a correspondingly expressive integrated grouping construct. Records with defined fields are comparatively rare; e.g., present in [Luo09] and OO-languages. Similarly, imports between record types and/or record terms are featured only sporadically, e.g., in Nuprl [Con+86], maybe even as an afterthought only.
Finally, note a related trade-off that is orthogonal to our development: even after choosing either a theory or a record to define grouping, many systems still offer a choice whether a declaration becomes a parameter or a field. See [SW11] for a discussion.
| System | stratified | integrated |
|---|---|---|
| ML | signature/module | record |
| C++ | class | class, struct |
| Java | class | class |
| Idris [Bra13] | module | record |
| Coq [Coq15] | module | record |
| HOL Light [Har96] | ML signatures | records |
| Isabelle [Wen09] | theory, locale | record |
| Mizar [Miz] | article | structure |
| PVS [ORS92] | theory | record |
| OBJ [Gog+93] | theory | |
| FoCaLiZe [Har+12] | species | record |

Figure 8.2: Stratified and Integrated Groupings in Various Systems
This chapter presents the first formal system that systematically internalizes theories into record types. The central idea is to use an operator Mod that turns the theory 𝑇 into the type Mod(𝑇), which behaves like a record type. We take special care not to naively compute this record type, which would not scale well to the common situations where theories with hundreds of declarations or more are used. Instead, we introduce record types that allow for defined fields and merging so that Mod(𝑇) preserves the structure of 𝑇.
This approach combines the advantages of stratified and integrated grouping in a lightweight language feature that is orthogonal to and can be easily combined with other foundational language features. Concretely, it is realized as a module in the Mmt framework. By combining our new modules with existing ones, we obtain many formal systems with internalized theories. In particular, our typing rules conform to the abstractions of Mmt so that Mmt’s type reconstruction [Rab17a] is immediately applicable to our features.
8.1 Groupings in Formal Languages
Languages can differ substantially in the syntax and semantics of these constructs. Our interest here is in one difference in particular, which we call the difference between stratified and integrated grouping.
8.1.1 Analysis
With stratified grouping, the language is divided into a lower level for the base language and a higher level that introduces the grouping constructs. For example, the SML module system is stratified: it uses a simply typed 𝜆-calculus at the lower level and signatures for the type-like and structures for the value-like grouping constructs at the higher level. Critically, the higher level constructs are not valid objects at the lower level: even though signatures behave similarly to types, they are not types of the base language. With integrated grouping, only one level exists: the grouping construct is one out of many type-forming operations of the base language with no distinguished ontological status. For example, SML also provides record types as a grouping construct that is integrated with the type system.
Stratified languages have the advantage that they can be designed in a way that yields a conservativity property: all higher level features can be seen as abbreviations that can be compiled into the base language. This corresponds to a typical historical progression where a simple base language is designed first and studied theoretically (e.g., the simply-typed 𝜆-calculus) and grouping is added later when practical applications demand it. But they have the disadvantage that they tend towards a duplication of features: many operations of the lower level are also desirable at the higher level. For example, SML functors are essentially functions whose domain and codomain are signatures, a duplication of the function types that already exist in the base language. In logics, this problem is even more severe because quantification and equality (and eventually tactics, decision procedures etc.) quickly become desirable at the higher level as well, at which point a duplication of features tends to become infeasible. A well-known example of this trap is the stratified Coq module system (inspired by SML), which practitioners often dismiss in favor of using record types, most importantly in the mathematical components project [Mat].
This may lead us to believe that record types are the way to go – but this is not ideal either. Record types usually do not support advanced declarations like imports, definitions, and notations, which are commonplace in stratified languages and indispensable in practice. Depending on the system, record types may also forbid some declarations such as type declarations (which would require a higher universe to hold the record type), dependencies between declarations (which would require dependent types), and axioms (which do not fit the record paradigm in systems that do not use a propositions-as-types design). And complex definition principles such as for inductive types and recursive functions are often placed into a stratified higher level just to handle their inherent difficulty. Moreover, stratified grouping has proved very appropriate for organizing extra-logical declarations such as prover instructions (e.g., tactics, rewrite rules, unification hints), examples, sectioning, comments, and metadata. While some systems use files as a simple, implicit higher level grouping construct, most systems use an explicit one. The exalted status of higher level grouping also often supports documentation and readability because it makes the large-scale structure of a development explicit and obvious. This is particularly helpful when formalizing software specifications or mathematical theories, whose structure naturally corresponds to those offered by higher-level grouping. In our work on integrating theorem prover libraries (see Part III), my colleagues at KWARC and I have experienced that this correspondence makes it much easier to compare and integrate different stratified formalizations of the same concepts.
For an in-depth discussion of stratified and integrated groupings in various languages, see [MRK].
8.2 Record Types with Defined Fields
We now introduce record types as an additional module of our framework. The basic intuition is that ⟦Γ⟧ and ⟬Γ⟭ construct record types and terms. We call a context fully typed resp. defined if all fields have a type resp. a definition. In ⟦Γ⟧, Γ must be fully typed and may additionally contain defined fields. In ⟬Γ⟭, Γ must be fully defined; the types are optional and usually omitted in practice.
Because we frequently need fully defined contexts, we introduce a notational convention for them: a context denoted by a lower case letter like 𝛾 is always fully defined. In contrast, a context denoted by an upper case letter like Γ may have any number of types or definitions.
We extend our grammar as in Figure 8.3.
Remark 8.1: Field Names and Substitution in Records
Note that we use the same identifiers for variables in contexts and fields in records. This allows reusing results about contexts when reasoning about and implementing records. In particular, it immediately makes our records dependent, i.e., both in a record type and – maybe surprisingly – in a record term every variable 𝑥 may occur in subsequent fields. In some sense, this makes 𝑥 bound in those fields. However, record types are critically different from ∑-types: we must be able to use 𝑥 in record projections, i.e., 𝑥 can not be subject to 𝛼-renaming.
As a consequence, capture-avoiding substitution is not always possible. This is a well-known problem that is usually remedied by allowing every record to declare a name for itself (e.g., the keyword this in many object-oriented languages), which is used to disambiguate between record fields and a variable in the surrounding context (or fields in a surrounding record). We gloss over this complication here and simply make substitution a partial function.
Before stating the rules, we introduce a few critical auxiliary definitions:
Definition 8.1: Substituting in a Record
We extend substitution 𝑡[𝑥/𝑡′] to records, distinguishing cases on whether the substituted variable occurs as a field name. (The case distinction is given as displayed rules.)
Definition 8.2: Substituting with a Record
We write 𝑡[𝑟/∆] for the result of substituting any occurrence of a variable 𝑥 declared in ∆ with 𝑟.𝑥. In the special case where 𝑟 = ⟬𝛿⟭, we simply write 𝑡[𝛿] for 𝑡[⟬𝛿⟭/𝛿], i.e., we have 𝑡[𝑥₁ ∶= 𝑡₁, ..., 𝑥ₙ ∶= 𝑡ₙ] = 𝑡[𝑥ₙ/𝑡ₙ]...[𝑥₁/𝑡₁].
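The right-to-left expansion in Definition 8.2 can be sketched in Python on a toy term language (strings for variables, tuples for applications; all names are illustrative, and binders are deliberately not modelled):

```python
# Sketch of Definition 8.2: t[δ] expands right to left, so later fields
# may refer to earlier ones.

def subst(t, x, s):
    """Capture-ignoring substitution t[x/s] (cf. Remark 8.1: substitution
    into records is only partial; binders are not modelled here)."""
    if isinstance(t, str):
        return s if t == x else t
    if isinstance(t, tuple):
        return tuple(subst(u, x, s) for u in t)
    return t  # literals are left unchanged

def subst_record(t, delta):
    """t[x1 := t1, ..., xn := tn] = t[xn/tn]...[x1/t1]."""
    for x in reversed(list(delta)):
        t = subst(t, x, delta[x])
    return t

# δ = ⟬a := 1, b := succ(a)⟭ : the definition of b refers to a
delta = {"a": 1, "b": ("succ", "a")}
print(subst_record(("plus", "a", "b"), delta))  # ('plus', 1, ('succ', 1))
```

Because the substitution for 𝑥ₙ is applied first, any occurrence of an earlier field in 𝑡ₙ is itself resolved by the later substitutions, as in the example.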
Our rules for records are given in Figure 8.4. We remark on a few subtleties below.
Remark 8.2: Horizontal Subtyping and Equality
As mentioned in Remark 4.3, Mmt’s equality judgment could alternatively be formulated as an untyped equality 𝑡 ≡ 𝑡′. For our presentation of record types, however, the use of typed equality is critical.
For example, consider record values 𝑟₁ = ⟬𝑎 ∶= 1, 𝑏 ∶= 1⟭ and 𝑟₂ = ⟬𝑎 ∶= 1, 𝑏 ∶= 2⟭ as well as record types 𝑅 = ⟦𝑎 ∶ 𝑛𝑎𝑡⟧ and 𝑆 = ⟦𝑎 ∶ 𝑛𝑎𝑡, 𝑏 ∶ 𝑛𝑎𝑡⟧. Due to horizontal subtyping, we have 𝑆 <∶ 𝑅 and thus both 𝑟ᵢ ⇐ 𝑆 and 𝑟ᵢ ⇐ 𝑅. This has the advantage that the function 𝑆 → 𝑅 that throws away the field 𝑏 becomes the identity operation. Now our equality at record types behaves accordingly and checks only for the equality of those fields required by the type. Thus, 𝑟₁ ≡ 𝑟₂ ∶ 𝑅 is true whereas 𝑟₁ ≡ 𝑟₂ ∶ 𝑆 is false, i.e., the equality of two terms may depend on the type at which they are compared. While seemingly dangerous, this makes sense intuitively: 𝑟₁ can be replaced with 𝑟₂ in any context that expects an object of type 𝑅 because in such a context the field 𝑏, where 𝑟₁ and 𝑟₂ differ, is inaccessible.
Of course, this treatment of equality precludes downcasts: an operation that casts the equal terms 𝑟₁ ∶ 𝑅 and 𝑟₂ ∶ 𝑅 into the corresponding unequal terms of type 𝑆 would be inconsistent. But such downcasts are still possible (and valuable) at the meta-level. For example, a tactic 𝐺𝑟𝑜𝑢𝑝𝑆𝑖𝑚𝑝(𝐺, 𝑥) that simplifies terms 𝑥 in a group 𝐺 can check if 𝐺 is commutative and in that case apply more simplification operations.
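The type-indexed equality of Remark 8.2 can be made concrete in a few lines of Python. This is a sketch only: record types are modelled simply as lists of required field names, so only the horizontal aspect is captured.

```python
# Sketch of typed record equality: two record values are compared only at
# the fields required by the type at which they are compared.

def equal_at(r1, r2, typ):
    """r1 ≡ r2 : typ, where typ lists the required field names."""
    return all(r1[f] == r2[f] for f in typ)

r1 = {"a": 1, "b": 1}
r2 = {"a": 1, "b": 2}
R = ["a"]            # ⟦a : nat⟧
S = ["a", "b"]       # ⟦a : nat, b : nat⟧

print(equal_at(r1, r2, R))   # True: at R the field b is inaccessible
print(equal_at(r1, r2, S))   # False
```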
We will sometimes omit a type in the formal presentation of rules if convenient. The implementation in Mmt allows for equality rules to have an optional argument for the type at which they operate, hence both typed and untyped equalities can be treated by the system.
Example 8.1:
Figure 8.5 shows a record type of Semilattices (actually, this is a kind because it contains a type field) analogous to the grouping in Figure 8.1 (using the usual encoding of axioms via judgments-as-types and higher-order abstract syntax for first-order logic).
[Figure 8.5: the record type Semilattice, with the axiom fields typed via the DED judgment]
Then, given a record 𝑟 ∶ Semilattice, we can form the record projection 𝑟.∧, which has type 𝑟.𝑈 → 𝑟.𝑈 → 𝑟.𝑈, and 𝑟.assoc yields a proof that 𝑟.∧ is associative. The intersection on sets forms a semilattice, so (assuming we have proofs ∩−assoc, ∩−comm, ∩−idem with the corresponding types) we can give an instance of that type as
interSL ∶ Semilattice ∶= ⟬𝑈 ∶= Set, ∧ ∶= ∩, assoc ∶= ∩−assoc, ...⟭
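The instance interSL can be mimicked with plain Python dictionaries, with lookup standing in for record projection. This is an informal illustration: the carrier and the proof field are stubs, since neither types nor proof terms are modelled.

```python
# Sketch of Example 8.1: an instance of the Semilattice record type for
# set intersection.

intersect = lambda x, y: x & y

interSL = {
    "U": "Set",                       # the carrier field (stub)
    "and": intersect,                 # ∧ := ∩
    "assoc": "witness for ∩-assoc",   # stubbed proof field
}

x = frozenset({1, 2, 3})
y = frozenset({2, 3, 4})
print(interSL["and"](x, y))   # the intersection of the two sets
```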
Theorem 8.1: Principal Types
Our inference rules (in plain LF) infer a principal type for each well-typed normal term.
Proof. Let Γ be a fixed well-typed context. We need to show that for any normal expression 𝑡 the inferred type is the most specific one, meaning if Γ ⊢ 𝑡 ⇒ 𝑇, then for any 𝑇′ with Γ ⊢ 𝑡 ⇐ 𝑇′ we have Γ ⊢ 𝑇 <∶ 𝑇′.
If the only type checking rule applicable to a term 𝑡 is an inference rule, then the only way for 𝑡 to check against a type 𝑇′ which is not the inferred type 𝑇 is by first inferring 𝑇 and then checking Γ ⊢ 𝑇 <∶ 𝑇′, so in these cases the claim follows by default.
By induction on the grammar:
Consider the type checking rule for records in Figure 8.4 and let 𝑥 ∶ 𝑇[∶= 𝑑] ∈ ∆′. Since Γ ⊢ 𝑡 ⇐ ⟦∆′⟧, we have Γ ⊢ 𝑟.𝑥 ⇐ 𝑇 (and if 𝑥 is defined in ∆′ also Γ ⊢ 𝑡.𝑥 ≡ 𝑑 ∶ 𝑇) and since ∆ is inferred from 𝑡 we have 𝑥 ∶ 𝑇′ ∶= 𝑑 in ∆, where by hypothesis 𝑇′ is the principal type of 𝑑 and hence Γ ⊢ 𝑇′ <∶ 𝑇.
As a result, Γ, 𝑟 ∶ ⟦∆⟧ ⊢ 𝑟.𝑥 ⇒ 𝑇′, therefore Γ, 𝑟 ∶ ⟦∆⟧ ⊢ 𝑟.𝑥 ⇐ 𝑇 and Γ, 𝑟 ∶ ⟦∆⟧ ⊢ 𝑟.𝑥 ≡ 𝑑 ∶ 𝑇 and hence Γ, 𝑟 ∶ ⟦∆⟧ ⊢ 𝑟 ⇐ ⟦∆′⟧, which makes ⟦∆⟧ the principal type of 𝑡.
In analogy to function types, we can derive the subtyping properties of record types. We introduce context subsumption and then combine horizontal and vertical subtyping in a single statement.
Intuitively, ∆₁ ↪ ∆₂ means that everything of ∆₁ is also in ∆₂. That yields:
Theorem 8.2: Record Subtyping
The following rule is derivable: from Γ ⊢ ∆₁ ↪ ∆₂ infer Γ ⊢ ⟦∆₂⟧ <∶ ⟦∆₁⟧.
Proof. Assume Γ ⊢ ∆₁ ↪ ∆₂. We need to show Γ, 𝑟 ∶ ⟦∆₂⟧ ⊢ 𝑟 ⇐ ⟦∆₁⟧. By the type checking rule in Figure 8.4, for any 𝑥 ∶ 𝑇[∶= 𝑡] ∈ ∆₁, we need to show that Γ, 𝑟 ∶ ⟦∆₂⟧ ⊢ 𝑟.𝑥 ⇐ 𝑇 (and if applicable Γ, 𝑟 ∶ ⟦∆₂⟧ ⊢ 𝑟.𝑥 ≡ 𝑡 ∶ 𝑇).
By definition of ∆₁ ↪ ∆₂, since 𝑥 ∶ 𝑇[∶= 𝑡] ∈ ∆₁, we have 𝑥 ∶ 𝑇′[∶= 𝑡] ∈ ∆₂ and Γ ⊢ 𝑇′ <∶ 𝑇, and if 𝑥 is defined in ∆₁ the required equality holds as well, so the type checking rule proves Γ, 𝑟 ∶ ⟦∆₂⟧ ⊢ 𝑟 ⇐ ⟦∆₁⟧ and the claim follows.
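The combined horizontal and vertical subtyping test can be sketched in Python: a record type is modelled as a mapping from field names to field types, and `field_subtype` is a hypothetical subtype test on field types (here a toy relation on atomic type names).

```python
# Sketch of Theorem 8.2: ⟦Δ2⟧ <: ⟦Δ1⟧ holds when every field required by
# Δ1 occurs in Δ2 at a subtype.

def record_subtype(delta2, delta1, field_subtype):
    return all(x in delta2 and field_subtype(delta2[x], delta1[x])
               for x in delta1)

def field_subtype(t1, t2):
    # toy subtyping on atomic type names: nat <: int, plus reflexivity
    return t1 == t2 or (t1, t2) == ("nat", "int")

S = {"a": "nat", "b": "nat"}   # ⟦a : nat, b : nat⟧
R = {"a": "int"}               # ⟦a : int⟧

print(record_subtype(S, R, field_subtype))   # True: S <: R
print(record_subtype(R, S, field_subtype))   # False
```

Width (extra fields in Δ2) gives the horizontal direction, while the per-field `field_subtype` check gives the vertical one.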
Merging Records
We introduce an advanced operation on records, which proves critical for both convenience and performance: Theories can easily become very large, containing hundreds or even thousands of declarations. If we want to treat theories as record types, we need to be able to build big records from smaller ones without exploding them into long lists. Therefore, we introduce an explicit merge operator + on both record types and terms.
In the grammar, this is a single production for terms: 𝑇 ∶∶= 𝑇 + 𝑇. The intended meaning of + is given by the following definition:
Definition 8.4: Merging Contexts
Given a context ∆ and a (not necessarily well-typed) context 𝐸, we define a partial function ∆ ∓ 𝐸 by a case analysis on the declarations of 𝐸. (The case distinction is given as displayed rules.) Note that ∓ is an asymmetric operator: While ∆ must be well-typed (relative to some ambient context), 𝐸 may refer to the names of ∆ and is therefore not necessarily well-typed on its own.
We do not define the semantics of + via inference and checking rules. Instead, we give equality rules that directly expand + into ∓ when possible. (The equality rules are given as displayed rules.)
In practice, we want to avoid using the computation rules for + whenever possible. Therefore, we prove admissible rules (i.e., rules that can be added without changing the set of derivable judgments) that we use preferentially:
Theorem 8.3:
(The two statements are given as displayed rules.)
Proof. If 𝑅₁ = ⟦∆₁⟧ and 𝑅₂ = ⟦∆₂⟧ are well-typed record types, then 𝑅₁ + 𝑅₂ ↝ ⟦∆₁ ∓ ∆₂⟧. By definition of ∓, any 𝑟 ∶ 𝑅₁ + 𝑅₂ hence has all fields defined that are required by both ⟦∆₁⟧ and ⟦∆₂⟧ and the other way around.
By the rules for ∓, as can easily be verified, it follows that ∆₁ ∓ ∆₂ ↪ 𝛿₁ ∓ 𝛿₂, and hence ⟬𝛿₁ ∓ 𝛿₂⟭ ⇒ ⟦𝛿₁ ∓ 𝛿₂⟧ <∶ ⟦∆₁ ∓ ∆₂⟧ = 𝑅₁ + 𝑅₂.
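A much-simplified version of the merge operator can be sketched in Python with dicts standing in for contexts. The real ∓ is a partial function defined by a case analysis on declarations; this sketch merely appends and overrides fields, which is enough to show how a larger record type arises from a smaller one. All field names below are illustrative.

```python
# Simplified sketch of merging contexts (Definition 8.4).

def merge(delta, e):
    """Δ ∓ E, simplified: E may refer to and extend the fields of Δ."""
    out = dict(delta)   # Δ must be well-typed on its own ...
    out.update(e)       # ... while E is interpreted relative to Δ
    return out

semilattice = {"U": "type", "and": "U -> U -> U"}
order = {"leq": "U -> U -> bool"}   # refers to the field U of Δ

print(sorted(merge(semilattice, order)))   # ['U', 'and', 'leq']
```

The asymmetry is visible in the comments: only `delta` is assumed well-typed in isolation, while `e` may mention its fields.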
Inspecting the type checking rule in Figure 8.4, we see that a record 𝑟 of type ⟦∆⟧ must repeat all defined fields of ∆. This makes sense conceptually but would be a major inconvenience in practice. The merging operator solves this problem elegantly as we see in the following example:
Example 8.2:
Continuing our running example, we can now define a type of semilattices with order (and all associated axioms) as in Figure 8.6.
[Figure 8.6: the record type SemilatticeOrder, with proof fields typed via DED]
Now the explicit merging in the type SemilatticeOrder allows the projection interSLO.≤, which is equal to 𝜆𝑥,𝑦 ∶ (interSLO.𝑈). (𝑥 ≐ 𝑥 (interSLO.∧) 𝑦), and interSLO.refl yields a proof that this order is reflexive – without needing to define the order or prove the axiom anew for the specific instance interSL.
8.3 Internalizing Theories
For the purposes of this section, let Θ be the global context of available modules.
We can now add the internalization operator, for which everything so far was preparation. We add one production to the grammar: 𝑇 ∶∶= Mod(𝑋). The intended meaning of Mod(𝑋) is that it turns a theory 𝑋 into a record type and a morphism 𝑋 ∶ 𝑃 → 𝑄 into a function Mod(𝑄) → Mod(𝑃). For simplicity, we only state the rules for the case where all include declarations are at the beginning of theory/morphism:
where we use the abbreviations flat and defined (given as displayed definitions). In the rule for morphisms, the occurrence of Mod(𝑃) may appear redundant; but it is critical to (i) make sure all defined declarations of 𝑃 are part of the record and (ii) provide the expected types for checking the declarations in 𝛿.
Example 8.3:
Consider the theories in Figure 8.7. Applying Mod(⋅) to these theories yields exactly the record types of the same name introduced in Section 8.2 (Figures 8.5 and 8.6), i.e., we have interSL ⇐ Mod(Semilattice) and interSLO ⇐ Mod(SemilatticeOrder). In particular, Mod preserves the modular structure of the theory.
[Figure 8.7: the theories Semilattice and SemilatticeOrder]
The basic properties of Mod(𝑋) are collected in the following theorem:
Theorem 8.4: Functoriality
(The three statements are given as displayed rules.)
Proof. Each claim follows immediately by the computation rule for Mod(⋅).
An immediate advantage of Mod(⋅) is that we can now use the expression level to define expression-like theory level operations. As an example, we consider the intersection 𝑃 ∩ 𝑃′ of two theories, i.e., the theory that includes all theories included by both 𝑃 and 𝑃′. Instead of defining it at the theory level, which would begin a slippery slope of adding more and more theory level operations, we can simply build it at the expression level: 𝑃 ∩ 𝑃′ ∶= Mod(𝑄₁) + ... + Mod(𝑄ₙ) where the 𝑄ᵢ are all theories included into both 𝑃 and 𝑃′.
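Computing the 𝑄ᵢ amounts to intersecting the transitive include closures of the two theories, which can be sketched as follows (the include graph below is hypothetical):

```python
# Sketch of P ∩ P' := Mod(Q1) + ... + Mod(Qn): the Qi are the theories
# transitively included into both P and P'.

def transitive_includes(theory, includes):
    seen, todo = set(), [theory]
    while todo:
        t = todo.pop()
        for q in includes.get(t, []):
            if q not in seen:
                seen.add(q)
                todo.append(q)
    return seen

includes = {  # hypothetical include graph
    "SemilatticeOrder": ["Semilattice"],
    "Semilattice": ["Magma"],
    "Band": ["Magma"],
}

common = (transitive_includes("SemilatticeOrder", includes)
          & transitive_includes("Band", includes))
print(common)   # {'Magma'}
```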
Note that the computation rules for Mod are efficient in the sense that the structure of the theory level is preserved. In particular, we do not flatten theories and morphisms into flat contexts, which would be a huge blow-up for big theories.
However, efficiently creating the internalization is not enough. Mod(𝑋) is defined via +, which is itself only an abbreviation whose expansion amounts to flattening. Therefore, we establish admissible rules that allow working with internalizations efficiently, i.e., without computing the expansion of +:
Theorem 8.5:
Fix well-typed Θ, Γ and 𝑃 = {include 𝑃₁, ..., include 𝑃ₙ, ∆} in Θ. Then the following rules are admissible (given as displayed rules), where [𝑟/𝑃] abbreviates the substitution that replaces every 𝑥 declared in a theory transitively-included into 𝑃 with 𝑟.𝑥.
The first rule in Theorem 8.5 uses the modular structure of 𝑃 to check 𝑟 at type Mod(𝑃). If 𝑟 is of the form ⟬𝛿⟭, this is no faster than flattening Mod(𝑃) all the way. But in the typical case where 𝑟 is also formed modularly using a similar structure as 𝑃, this can be much faster. The second rule performs the corresponding type inference for an element of Mod(𝑃) that is formed following the modular structure of 𝑃. In both cases, the last premise is again only needed to make sure that 𝑟 does not contain ill-typed fields not required by Mod(𝑃). Also note that if we think of Mod(𝑃) as a colimit and of elements of Mod(𝑃) as morphisms out of 𝑃, then the second rule corresponds to the construction of the universal morphisms out of the colimit.
Example 8.4:
We continue Example 8.3 and assume we have already checked interSL ⇐ Mod(Semilattice) (*). We want to check interSL + ⟬𝛿⟭ ⇐ Mod(SemilatticeOrder). Applying the first rule of Theorem 8.5 reduces this to multiple premises, the first one of which is (*) and can thus be discharged without inspecting interSL.
Example 8.4 is still somewhat artificial because the involved theories are so small. But the effect pays off enormously on larger theories.
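The effect can be illustrated with a small caching sketch in Python. All names are hypothetical: `check_flat` stands for an expensive non-modular check, and the cache plays the role of the already-discharged premise (*).

```python
# Illustration of Example 8.4: once a record has been checked against
# Mod(Semilattice), that premise is discharged from a cache instead of
# re-flattening the theory.

checked = {}   # cache of already verified (record, theory) judgments

def check(record_id, theory, check_flat):
    key = (record_id, theory)
    if key not in checked:
        checked[key] = check_flat(record_id, theory)
    return checked[key]

calls = []
def check_flat(r, t):
    calls.append((r, t))   # count how often the expensive check runs
    return True

check("interSL", "Semilattice", check_flat)   # the expensive check (*)
check("interSL", "Semilattice", check_flat)   # discharged from the cache
print(len(calls))   # 1
```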
Additionally, we can explicitly allow views within theories into the current theory. Specifically, given a theory 𝑇, we allow a view 𝑇 = {..., 𝑉 ∶ 𝑇′ → ⋅ = {...}, ...} the codomain of which is the containing theory 𝑇 (up to the point where 𝑉 is declared). This view induces a view 𝑇/𝑉 ∶ 𝑇′ → 𝑇 in the top-level context Θ, but importantly, within 𝑇 (and its corresponding inner context Γ) every variable in 𝑉 is defined via a valid term in Γ. Correspondingly, Mod(𝑉) is – in context Γ – a constant function Mod(𝑇) → Mod(𝑇′), which we can consider as an element of Mod(𝑇′) directly.
This allows for conveniently building instances of Mod(⋅), and all checking for well-formedness is reduced to structurally checking the view to be well-formed, effectively carrying over all efficiency advantages of structure checking and modular development of theories/views:
Theorem 8.6:
Let Θ, Γ be well-formed and 𝑇/𝑉 ∶ 𝑇′ → 𝑇 = {...} in Θ, where Γ is the current context within theory 𝑇 containing 𝑉 ∶ 𝑇′ → ⋅ = {...}. The following rule is admissible (given as a displayed rule).
Proof. We consider Mod(𝑉) an abbreviation for Mod(𝑇/𝑉)(⟬⋅⟭). Since all definitions in Mod(𝑇/𝑉) are well-typed terms in context Γ, the record ⟬⋅⟭ does not actually occur anywhere in the simplified application Mod(𝑇/𝑉)(⟬⋅⟭), which makes this expression well-typed.
8.4 Implementation
The implementation of record types and the Mod(⋅) operator can be found at [LFX] /scala/.../Records. They are used extensively in the Math-in-the-Middle archive (see Section 3.4).
For a particularly interesting example that occurs in MitM, consider the theories for modules and vector spaces (over some ring/field) given in Listing 8.1 to Listing 8.4, which elegantly follow informal mathematical practice. Going beyond the syntax introduced so far, these use parametric theories. Our implementation extends Mod to parametric theories as well, namely in such a way that e.g. Mod(module theory) ∶ ∏𝑅∶ring Mod(module theory(𝑅)) and correspondingly for fields. Thus, we obtain vectorspace = 𝜆F∶field. ((module F) + ...) and, e.g., vectorspace(R) <∶ module(R), where vectorspace, module, field and ring are all Mod(⋅)-types.
Because of type-level parameters, this requires some kind of parametric polymorphism in the type system. For our approach, the shallow polymorphism module that is available in PLF (see Remark 4.6) is sufficient.
Listing 8.1: A Theory of Rings
Listing 8.2: A Theory of Fields
Listing 8.3: A Theory of Modules
Listing 8.4: A Theory of Vector Spaces
Figure 8.4: Rules for Records (formation for fully typed contexts, introduction for fully defined contexts, elimination, type checking, equality checking/extensionality, and computation; given as displayed rules)
Chapter 9
Conclusion and Additional Features
The features presented in this part, collectively, yield a logical framework capable of representing the logical foundations of most formal systems encountered in practice. Our modular approach allows for picking and choosing the required features for a given formalization in a compositional manner, while avoiding the potential overhead of unnecessary rules for a given formalization.
Additional Features
Notably, many formal systems offer additional features as “syntactic sugar”, with an abbreviation-like semantics that can already be fully represented in the underlying logic, but which allow for formalizing content more conveniently. Examples include record and function updates in PVS (e.g. “𝑟 is defined as the record 𝑠, except for field 𝑎, which has value 𝑡 instead”), let-operators that introduce new variables as abbreviations for complex expressions, or sections in Coq (where local variables are introduced as constants, references to which are lambda-abstracted away when closing a section).
In the interest of preserving the syntactic representation of an imported library as much as possible, we prefer adding these features in the formalization of a system’s foundation. Due to their trivial semantics, these features rarely pose a challenge for implementation, and because they can be elaborated away (and usually are in the system itself during checking), they do not introduce hurdles for library integration.
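The abbreviation-like semantics of such features can be seen in a two-line Python sketch of a PVS-style record update: the update elaborates to a fresh record copying all other fields. Plain Python dicts stand in for records here; this is an illustration, not the PVS semantics definition.

```python
# Sketch of record-update sugar: "r is s, except that field a has value t"
# elaborates to a copy of s with one field replaced.

def record_update(s, field, value):
    r = dict(s)        # copy every field of s ...
    r[field] = value   # ... and replace the single updated one
    return r

s = {"a": 1, "b": 2}
r = record_update(s, "a", 42)
print(r)   # {'a': 42, 'b': 2}
print(s)   # {'a': 1, 'b': 2} – the original record is unchanged
```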
We have implemented some of these convenience features, including a let-operator, an object-level substitution function, a primitive polymorphic type of lists, and Coq-like sections, the latter of which is very naturally implemented as a structural feature and used in our import of the Coq library [MRS]. Furthermore, Fulya Horozal previously implemented an extension of LF with finite sequences and flexary operators [Hor14].
Module-Level Extensions
The Mod-operator described in Chapter 8 additionally uses module-level expressions (namely references to theories), but is still a rule-based object-level typing feature of the underlying logic. However, it is also possible to extend Mmt’s module system itself in various ways. These are (regarding their implementation) rather recent additions to the Mmt system and are not central to this thesis; however, they offer interesting extensions for the goal of this part. In particular, module-level features are entirely orthogonal to the logic-level typing features discussed in this part; hence they are intrinsically compatible with the modular development approach central to this thesis, and composable with all features herein.
In addition to allowing the formalization of the foundations of formal systems, our modular logical framework is used as a basis for the Math-in-the-Middle library (see Section 3.4), which serves as a case study and explorative playground for evaluation and inspiration for additional features.
Additionally, the latter allows us to experiment with combining and unifying various different approaches and combinations of features, e.g. Mod-types for theories defined via theory combinators that are the codomains of implicit morphisms, or identifying natural numbers as a type with a declarative implementation using the Peano axioms, and a separate type populated by literals.
Part III
Translating Mathematical Libraries into a Universal Format
To integrate formal libraries from various sources and to implement generic knowledge management services, as is one of the objectives of this thesis (see Section 3.5), it is necessary to have a unifying framework and language in which these libraries are represented. Naturally, OMDoc/Mmt is (due to its generality and modularity) the prime candidate for such a framework. This Part covers the process of translating one library from an external system into the OMDoc/Mmt language.
The OAF project (see Section 3.2) in particular focuses on integrating libraries from theorem prover systems; however, the methodology involved generalizes easily to the purposes of the OpenDreamKit project (see Section 3.3), where the aim is to integrate databases of mathematical objects and their properties (e.g. finite groups), computer algebra systems and related systems. Together, they cover three aspects of the tetrapod (see Chapter 1), namely inference, tabulation and computation. Additionally, the Mmt system itself manages the organization aspect.
In all settings, the process involved in translating a library 𝐿 from system 𝑆 to OMDoc/Mmt entails three main steps.
There is a continuum of possible translations, ranging from perfect adequacy to mere representation. In the best case, we can fully type check the result of the translation and the precise semantics of the original content is preserved under translation. In the worst case, we obtain a representation of the symbols in the original library as untyped and undefined constants. Ideally we want the former, however, depending on the specifics of the system, this might be impossible to achieve without reimplementing the entire system within Mmt. For example, a system with predicate subtypes and a powerful automated prover will usually rely on the latter to type check contents in a manner that can only be reproduced with a similarly powerful prover. In the case of (untyped) computer algebra systems, it might be impossible to check whether a given expression in the language of the system is even well-formed without passing it back to the system itself and evaluating the result.
However, it should be noted that for many knowledge management services such as alignments (see Chapter 14) or (in a limited form) content translation (see Chapter 15), even mere representation is sufficient to obtain useful results. From that point of view, an adequate translation is not strictly necessary, but usually the usefulness of generic services strongly correlates with the amount of details and structure preserved in the translated library.
In this thesis we only consider the statements in a given formal library (declarations, definitions, theorems, axioms) – i.e. we do not cover proofs or specific implementations of methods in computer algebra systems. Importing proofs poses additional challenges and requires an adequate proof language sufficiently generic to subsume the various ways proofs can be represented in formal systems, if they can be exported from the system at all. To the best of my knowledge, such a generic proof language that can serve as an adequate target for translations from theorem prover systems does not yet exist. All knowledge management services presented in this thesis consequently do not benefit from any knowledge about proofs, other than (potentially) the binary information whether a statement is proven at all and (possibly) dependency management.
The methodology described in this part has been applied to various systems and libraries by several people, resulting in OMDoc/Mmt imports for Mizar [Ian+13], HOL Light [KR14], TPTP [Sut09], IMPS [Bet18], Isabelle (as-of-yet unpublished), Coq [MRS] and PVS [Koh+17c]. As a representative example, the latter is covered in detail in Chapter 10. Moving on to less formal corpora, Chapter 11 demonstrates how to handle databases of mathematical objects, exemplarily the LMFDB, and Chapter 12 covers computer algebra systems, using the GAP and Sage systems as examples.
Chapter 10
Integrating Theorem Prover Libraries - PVS
Disclaimer:
The contents of this chapter have been published in [Koh+17c] with coauthors Michael Kohlhase, Sam Owre and Florian Rabe.
Both the writing as well as the theoretical results were developed in close collaboration between the authors, hence it is impossible to precisely assign authorship to individual authors. My contribution with regards to these can be assumed to be minor, although the writing has been reworked for this thesis.
The implementations (as described below) of the PVS-to-OMDoc translation, the formalization of the PVS foundation and the implementation of the logical framework used therein are my contribution.
10.1 Introduction and Preliminaries
PVS [ORS92] is a verification system, combining language expressiveness with automated tools. Its language is based on higher-order logic, and is strongly typed. It includes types and terms for concepts such as: numbers, records, tuples, functions, quantifiers, and recursive definitions. Full predicate subtypes are supported, which makes type checking undecidable; PVS generates type obligations (TCCs) as artefacts of type checking. For example, division is defined such that the second argument is nonzero, where nonzero is defined:
Note that functions in PVS are total; partiality is only supported via subtyping.
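As an illustration of the idea (a Python sketch, not PVS code), a predicate subtype can be modeled as a base type paired with a predicate; applying a function whose domain is such a subtype then generates a proof obligation (TCC) that must be discharged. All names below are hypothetical:

```python
# Illustrative sketch of predicate subtyping with TCCs (hypothetical names).
from dataclasses import dataclass
from typing import Callable

@dataclass
class PredSubtype:
    base: type                       # the base type
    pred: Callable[[object], bool]   # the restricting predicate
    name: str

# analogous to PVS's nonzero reals used as the domain of division
nonzero = PredSubtype(float, lambda r: r != 0, "nonzero")

def apply_with_tcc(f, arg, domain: PredSubtype):
    """Return f(arg) together with the TCC that had to be discharged."""
    tcc = f"{domain.name}({arg!r})"
    if not isinstance(arg, domain.base) or not domain.pred(arg):
        raise TypeError(f"unprovable TCC: {tcc}")
    return f(arg), tcc

result, obligation = apply_with_tcc(lambda y: 10.0 / y, 2.0, nonzero)
print(result, obligation)   # 5.0 nonzero(2.0)
```

In PVS the obligations are discharged by the prover rather than checked at run time; the sketch only illustrates where they arise.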
Beyond this, the PVS language has structural subtypes (i.e., a record that adds new fields to a given record), dependent types for records, tuples, and functions, recursive and co-recursive datatypes, inductive and co-inductive definitions, theory interpretations, and theories as parameters, conversions, and judgements that provide control over the generation of proof obligations. Specifications are given as collections of parameterized theories, which consist of declarations and formulas, and are organized by means of imports.
The PVS prover is interactive, but with a large amount of automation built in. It is closely integrated with the type checker, and features a combination of decision procedures, including BDDs, automatic simplification, rewriting, and induction. There are also rules for ground evaluation, random test case generation, model checking, and predicate abstraction. The prover may be extended with user-defined proof strategies.
PVS has been used as a platform for integration. It has a rich API, making it relatively easy to add new proof rules and integrate with other systems. Examples of this include the model checker, Duration Calculus, MONA, Maple, Ag, and Yices. The system is normally used through a customized Emacs interface, though it is possible to run it standalone (PVSio does this), and PVS features an XML-RPC server (developed independently of the work presented here) that will allow for more flexible interactions. PVS is open source, and is available at http://pvs.csl.sri.com.
As an example, Figure 10.1 gives a part of the PVS theory defining equivalence closures on a type T in its original syntax. PVS uses upper case for keywords and logical primitives; square brackets are used for types and round brackets for term arguments. The most important declarations in theories are
•
•
•
10.2 Formalizing Foundations
To define the language of PVS in Mmt, we carry out two steps.
Firstly, we choose a suitable logical framework by picking the necessary features presented in Part II. Notably, we include three features: anonymous record types (see Chapter 8), predicate subtypes (see Chapter 7), and imports of multiple instances of the same parametric theory. The latter is achieved by a structural feature (see Section 6.3) that elaborates into the lambda-abstracted declarations of the imported theory.
Then we use this logical framework to define the Mmt theory for PVS. Listing 10.1 shows the most fundamental constants of this theory.
Listing 10.1: The Fundamental Higher-Order Logic of PVS
We begin with a definition of PVS’s higher-order logic using only LF features. This includes dependent product and function types, classical booleans, and the usual formula constructors (see Listing 10.1). This is novel in how exactly it mirrors the syntax of PVS (e.g., PVS allows multiple aliases for primitive constants) but requires no special Mmt features.
We declare three constants for the three types of built-in literals together with Mmt rules for parsing and typing them. Using the new framework features, we give a shallow encoding of predicate subtyping (see Listing 10.2 for the new typing rule), a shallow definition of anonymous record types, as well as new declarations for PVS-style inductive and co-inductive types.
Listing 10.2: PVS-Style Predicate Subtyping in MMT and the Corresponding Rule
10.3 Importing Libraries
The PVS library export required three separate developments:
Firstly, Sam Owre has extended PVS with an XML export. This is similar to the LaTeX extension in PVS, which is built on the Common Lisp Pretty Printing facility. The XML export was developed in parallel with a Relax NG specification for the PVS XML files. Because PVS allows overloading of names, infers theory parameters, and automatically adds conversions, the XML generation is driven from the internal type-checked abstract syntax, rather than the parse tree. Thus the generated XML contains the fully type-checked form of a PVS specification with all overloading disambiguated. Future work on this will include the generation of XML forms for the proof trees. Figure 10.2 shows, as an example, the (slightly simplified) XML representation of the PVS theory presented in Figure 10.1.
Secondly, Florian Rabe documented the XML schema used by PVS as a set of case classes in Scala and wrote a generic XML parser in Scala that generates a schema-specific parser from such a set of inductive types (see Figure 10.3 for part of the specification). That way any change to the inductive types automatically changes the parser. While seemingly a minor implementation detail, this was critical for feasibility because the XML schema changed frequently along the way.
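The schema-driven idea can be sketched in Python with dataclasses standing in for the Scala case classes (the record types `Theory` and `VarDecl` below are hypothetical, not the actual PVS schema): the schema is declared once as typed records, and one generic routine parses XML against it, so changing the records automatically changes the parser.

```python
# Hypothetical sketch of a schema-driven XML parser: the schema is declared
# as dataclasses; a single generic routine builds typed values from XML.
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields

@dataclass
class VarDecl:        # record for a <decl name="..." tp="..."/> element
    name: str
    tp: str

@dataclass
class Theory:         # record for a <theory name="...">...</theory> element
    name: str
    decls: list

SCHEMA = {"theory": Theory, "decl": VarDecl}   # tag -> record type

def parse(elem: ET.Element):
    cls = SCHEMA[elem.tag]
    args = {}
    for f in fields(cls):
        if f.name in elem.attrib:              # simple fields from attributes
            args[f.name] = elem.attrib[f.name]
        else:                                  # list fields collect children
            args[f.name] = [parse(c) for c in elem]
    return cls(**args)

xml = '<theory name="EquivClos"><decl name="T" tp="TYPE"/></theory>'
t = parse(ET.fromstring(xml))
print(t.name, t.decls[0].tp)   # EquivClos TYPE
```

The actual implementation handles optional fields, mixed content and many more node kinds, but the coupling of schema declaration and parser is the point illustrated here.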
Thirdly, I wrote an Mmt plugin that parses the XML files generated by PVS and systematically translates their content into OMDoc/Mmt, using the symbols from my formalization of the PVS logic. This includes creating various generic indexes that can be used later for searching the content.
All processing steps preserve source references, i.e., URLs that point to a location (file and line/column) in a source file (the quadruples of numbers at place= in Figure 10.2 and <link rel="...?sourceRef" in Figure 10.4).
The table in Figure 10.5 gives an overview of the sizes of the involved libraries and the run times of the conversion steps. We note that the XML encoding considerably increases the size of representations. This is due to two effects: the internal, disambiguated form contains significantly more information than the user syntax (e.g. theory parameter instances and reconstructed types), and XML as a machine-oriented format is naturally more verbose. Furthermore, OMDoc uses OpenMath for term structures, which again increases file size. While in practice the file sizes are no problem for the Mmt tools presented here, the analogous import for the Isabelle libraries due to Makarius Wenzel has prompted us to compress the OMDoc files by default, significantly reducing their sizes. The Mmt system can read the compressed files directly without need for a user to extract them manually.
Figure 10.4: A Part of the Function EquivalenceClosure/EquivClos in OMDoc
|          | PVS source   |            | PVS → XML      |          | XML → OMDoc    |          |
|          | size/gz      | check time | result size/gz | run time | result size/gz | run time |
| Prelude  | 189.7/46.6kB | 33s        | 23.5/.67MB     | 11s      | 83.3/1.6MB     | 3m41s    |
| NASA Lib | 1.9/.426MB   | 23m25s     | 387.2/8.9MB    | 3m11s    | 2.5/.04GB      | 58m56s   |
Figure 10.5: File Sizes of the PVS Import at Various Stages
Chapter 11
Integrating External Databases (LMFDB)
Having covered importing fully formal theorem prover libraries, we will now turn our attention to systems where the semantics of the imported ontology is less precise. Specifically, we will look at our methodologies to integrate external databases of mathematical objects (exemplarily the LMFDB) and computer algebra systems (exemplarily the GAP and Sage systems). In other words, the tabulation and computation aspects of the tetrapod (see Chapter 1).
Integrating these into the Mmt system was done as part of the OpenDreamKit project (see Section 3.3) with the aim to facilitate data exchange and remote procedure calls between the systems, using the Math-in-the-Middle approach (MitM, see Section 3.4).
Disclaimer:
Parts of the contents of this and the following chapter have been published in [Deh+16] with coauthors Paul-Olivier Dehaye, Mihnea Iancu, Michael Kohlhase, Alexander Konovalov, Samuel Lelièvre, Markus Pfeiffer, Florian Rabe, Nicolas M. Thiéry and Tom Wiesing.
Both the writing as well as the theoretical results were developed in close collaboration between the authors, hence it is impossible to precisely assign authorship to individual authors. With respect to the writing itself, this applies only to the introductory sections and descriptions of the systems involved, and my contribution with regards to these can be assumed to be minor. However, they provide important context for the methodologies described.
My contribution in this and the following chapter consists of
•
A detailed description of the workflows involved in importing the system ontologies.
•
Implementing the importers for Sage and GAP.
•
Implementing the schema theories for the LMFDB.
•
Minor parts in the implementation of the infrastructure connecting the schema theories to the LMFDB backend.
11.1 LMFDB Knowledge and Interoperability
11.1.1 Introduction
The L-functions and modular forms database is a project involving dozens of mathematicians who assemble computational data about L-functions, modular forms, and related number theoretic objects. The main output of the project is a website, hosted at http://www.lmfdb.org, that presents this data so that it can serve as a reference for research efforts, and is accessible to postgraduate students. The mathematical concepts underlying the LMFDB are complex and varied, and a large amount of effort has been focused on how to relay knowledge, such as mathematical definitions and their relationships, to data and software. For this purpose, the LMFDB has developed so-called knowls, which are a technical solution to present LaTeX-encoded information interactively, heavily exploiting the concept of transclusion. The end result is a very modular and highly interlinked set of definitions in mathematical vernacular which can be easily anchored in vastly different contexts, such as an interface to a database, to browsable data, or as constituents of an encyclopedia [Lmfa]. The LMFDB code is primarily written in Python, with some reliance on Sage for the business logic. The backend uses the database system PostgreSQL. Again, due to the complexity of the objects considered, many idiosyncratic encodings are used for the data. This makes the whole data management lifecycle particularly tricky, and dependent on different select groups of individuals for each component.
As the LMFDB spans the whole “vertical” workflow, from writing software, to producing new data, up to presenting this new knowledge, it is a perfect test case for a large scale case study of the MitM approach. Conversely, a semantic layer would be beneficial to integrating its activities across data, knowledge and software.
11.2 Integrating the LMFDB with Math-in-the-Middle
Among the components of the LMFDB, elliptic curves stand out as a well-documented data set, and a source of best practices for other areas. We have generated MitM interface theories for LMFDB elliptic curves by (manually) refactoring and flexiformalizing the LaTeX source of knowls into sTeX (see Listing 11.1 for an excerpt), which can be converted into flexiformal OMDoc/MMT automatically. The MMT system can already type-check the definitions, avoiding circularity and ensuring some level of consistency in their scope, and makes them browsable through MathHub.info.
Listing 11.1: sTeX Flexiformalization of an LMFDB Knowl (original: [Lmfb])
The first step consisted of translating these informal definitions into progressively more exhaustive MMT formalizations of mathematical concepts (see Listing 11.2) as part of the MitM library. The two representations are coordinated via the theory and symbol names – we can see the sTeX representation as a human-oriented documentation of the Mmt equivalent.
Listing 11.2: MMT Formalization of Elliptic Curves
This gives us the necessary formal content to more precisely specify the semantics of the contents of the LMFDB elliptic curves database. The entries in this database represent elliptic curves and contain fields such as their Cremona label, which uniquely identifies a curve, its 2-adic generators, conductor, degree, etc.
The core methodology to connect a database to the MitM infrastructure is described in detail in [WKR17]. It relies on the following strategy:
•
Listing 11.3: The Schema Theory for Elliptic Curves
•
•
•
•
This methodology is generic and extends beyond the motivating databases in the LMFDB. [BKR] describes how to add the necessary infrastructure to cover additional databases and generate e.g. query interfaces and other services directly from schema theories.
Listing 11.4 shows how we can use the resulting constants in practice. The included theory db?ec_curves is the virtual theory generated by the schema theory in Listing 11.3. The constant 11a1 is computed as described above from the entry in the database with label 11a1 when it is first referenced. Applying the conductor function to this constant yields an int_lit with the value obtained from querying the database.
Listing 11.4: Using Virtual Theories
As a result, we obtain access to all the objects in the databases from within the Mmt system. The objects have globally unique Mmt URIs, and all properties and values for a given object that are stored in the database are obtainable using the Mmt syntax. In other words: The database behaves just like any other collection of Mmt theories and can consequently be used for any of the applications described in this thesis; most importantly those described in Part IV and those relevant for the OpenDreamKit project.
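The behaviour of such a virtual theory can be sketched as follows; the class and the toy dictionary backend are hypothetical stand-ins for the actual Mmt infrastructure and the LMFDB:

```python
# Sketch of a "virtual theory": constants are materialized lazily from
# database records the first time they are referenced, then cached.
class VirtualTheory:
    def __init__(self, query):
        self._query = query          # backend lookup, e.g. a database query
        self._cache = {}

    def constant(self, label):
        if label not in self._cache:                 # first reference:
            self._cache[label] = self._query(label)  # fetch and materialize
        return self._cache[label]

# toy backend standing in for the LMFDB elliptic-curve collection
db = {"11a1": {"label": "11a1", "conductor": 11, "degree": 1}}
ec_curves = VirtualTheory(lambda label: dict(db[label]))

curve = ec_curves.constant("11a1")
print(curve["conductor"])   # 11
```

Later references to the same label return the cached constant, so the database is queried at most once per object.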
Chapter 12
Integrating Generic Ontologies
12.1 Distributed Collaboration with GAP/Sage
Another aspect of interoperability in a mathematical virtual research environment – as is the goal of OpenDreamKit – is the possibility of distributed multisystem computations, where e.g. a given system may decide to delegate certain subcomputations or reasoning tasks to other systems.
There are already a variety of peer-to-peer interfaces between systems in the OpenDreamKit project (see Figure 3.3), which are based on the handle paradigm; for example Sage includes, among others, interfaces for GAP, Singular, and PARI. In this paradigm, when a system A delegates a calculation to a system B, the result r of the calculation is not converted to a native A object; instead B just returns a handle h (or reference) to the object r. Later, A can run further calculations with r by passing it as argument to functions or methods implemented by B. Some advantages of this approach are that we can avoid the overhead of back and forth conversions between A and B, and that we can manipulate objects of B from A, even if they have no native representation in A.
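The delegation scheme can be sketched as follows; `SystemB` and its operations are toy stand-ins for a remote system, not the actual Sage/GAP interface code:

```python
# Minimal sketch of the handle paradigm: system A keeps only a reference to
# an object living in system B and forwards further calls to B, avoiding
# back-and-forth conversion of the object itself.
class SystemB:
    def __init__(self):
        self._objects = {}
        self._next = 0

    def compute_group(self):
        # B creates an object internally and hands out only a handle
        self._next += 1
        self._objects[self._next] = {"elements": ["e", "a", "a2"]}
        return self._next

    def call(self, fname, handle):
        # A runs further computations on the object by going through B
        obj = self._objects[handle]
        if fname == "Size":
            return len(obj["elements"])
        raise NameError(fname)

B = SystemB()
h = B.compute_group()        # A holds h, never a native copy of the object
print(B.call("Size", h))     # 3
```

Note that A never sees the object's internal representation; it only passes the handle back to B.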
The next desirable feature is for the handle h to behave in A as if it was a native A object; in other words, one wants to adapt the API satisfied by r in B to match the API for the same kind of objects in A. For example, the method call h.cardinality() on a Sage handle h to a GAP object G should trigger in GAP the corresponding function call Size(G).
This can be implemented using the classical adapter pattern, mapping calls to Sage’s methods to corresponding GAP methods. Adapter classes have already been implemented for certain types of objects, like Sage’s PermutationGroup or MatrixGroup. However, this implementation lacks modularity: for example, if h is a handle to a mere set S, Sage cannot use the adapter method that maps h.cardinality() to Size(S), because this adapter method is only available in the above two adapter classes.
To get around this problem, Nicolas M. Thiéry and others have worked on a more semantic integration, where adapter methods are made aware of the type hierarchies of the respective systems, and defined at the highest available level of generality, as in Listing 12.1.
Listing 12.1: A Semantic Adapter Method in Sage
This peer-to-peer approach, however, does not scale up to a dozen systems. This is where the Math-in-the-Middle paradigm comes to the rescue. With it, the task is reduced to building interface theories and interface views into the core MitM ontology in such a way that the adapter pattern can be made generic in terms of the MitM ontology structure, without relying on the concrete structure of the respective type systems. Then the adapter methods for each peer-to-peer interface can be automatically generated. In our example the adapter method for cardinality can be constructed automatically as soon as the MitM interface views link the cardinality function in the Sage interface theory on Sets with the Size function in the corresponding interface theory for GAP.
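The generation step can be sketched as follows; the alignment table, backend class and generated handle class are all toy stand-ins, not the actual MitM implementation:

```python
# Sketch: once an alignment table links names across systems (here Sage's
# `cardinality` to GAP's `Size`), adapter methods can be generated
# generically instead of being written once per adapter class.
ALIGNMENTS = {("Sets", "cardinality"): "Size"}   # toy MitM-style alignment

class GapBackend:                 # stand-in for a running GAP session
    def call(self, fname, obj):
        if fname == "Size":
            return len(obj)
        raise NameError(fname)

def make_handle_class(category, backend):
    """Generate a handle class whose methods come from the alignments."""
    methods = {}
    for (cat, sage_name), gap_name in ALIGNMENTS.items():
        if cat == category:
            # each Sage-level method forwards to the aligned GAP function
            methods[sage_name] = lambda self, g=gap_name: backend.call(g, self.obj)
    methods["__init__"] = lambda self, obj: setattr(self, "obj", obj)
    return type(f"{category}Handle", (), methods)

SetHandle = make_handle_class("Sets", GapBackend())
h = SetHandle({1, 2, 3})
print(h.cardinality())   # 3
```

Adding a new alignment entry immediately yields the corresponding adapter method, without touching any adapter class by hand.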
Exporting the GAP Knowledge: Type System Documentation
The GAP type system encodes a wealth of mathematical knowledge, which can influence method selection. For example, establishing that a group is nilpotent will allow for more efficient methods to be run for finding its centre. The main difference between Sage and GAP lies in the method selection process. In Sage the operations implemented for an object and the axioms they satisfy are specified by its class which, together with its super classes, groups syntactically all the methods applicable in this context. In GAP, this information is instead specified by the truth-values of a collection of independent filters, while the context of applicability is specified independently for each method (the details are discussed later). Breuer and Linton describe the GAP type system in [BL] and the GAP documentation [Gap] also contains extensive information on the types themselves.
GAP allows some introspection of this knowledge after the system is loaded: the values of those attributes and properties that are unknown on creation can be computed on demand and stored for later reuse.
As a first step in generating interface theories for the MitM ontology, GAP developers have implemented tools to access mathematical knowledge encoded in GAP, such as introspection inside a running GAP session, export to JSON to import into MMT, and export as a graph for visualization and exploration. The JSON output of the GAP object system with default packages is currently around 11 Megabytes, and represents a knowledge graph with 540 vertices, 759 edges and 8 connected components (see Figures 12.1 and 12.2). If all packages are loaded, this graph expands to 1616 vertices, 2178 edges and 17 connected components.
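As a rough illustration of this kind of graph analysis, the following sketch counts vertices, edges and connected components of a small toy edge list (the real export is the ~11 MB JSON file mentioned above; filter names here are merely illustrative):

```python
# Sketch: count connected components of a knowledge graph given as an edge
# list, using a simple union-find structure (toy data, not the GAP export).
def connected_components(vertices, edges):
    parent = {v: v for v in vertices}

    def find(v):                       # find the root, with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in edges:                 # union the endpoints of each edge
        parent[find(a)] = find(b)
    return len({find(v) for v in vertices})

vertices = ["IsObject", "IsGroup", "IsAbelian", "IsList"]
edges = [("IsGroup", "IsObject"), ("IsAbelian", "IsGroup")]
print(len(vertices), len(edges), connected_components(vertices, edges))  # 4 2 2
```

On the actual export, the same computation yields the component counts quoted in the text.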
There is, however, another source of knowledge in the GAP universe: the documentation, which is provided in the GAPDoc format [LN12]. Besides the main manuals, GAPDoc is adopted by 97 out of the 130 packages currently redistributed with GAP. Conventionally GAPDoc is used to build text, PDF and HTML versions of the manual from a common source given in XML. The reference manual has almost 1400 pages and the packages add hundreds more.
The GAPDoc sources classify documentation by the type of the documented object (function, operation, attribute, property, etc.) and index them by system name. In this sense they are synchronized with the type system (which e.g. has the types of the functions) and can be combined into flexiformal OMDoc/MMT interface theories, just like the ones for LMFDB in Section 11.1. This conversion is currently under development and will lead to a significant increase of the scope of the MitM ontology.
As a side-effect of this work, Markus Pfeiffer discovered quite a few inconsistencies in the GAP documentation, which came from a semi-automated conversion of GAP manuals from the TeX-based manuals used in GAP 4.4.12 and earlier. In response, he implemented a consistency checker for the GAP documentation, which extracts type annotations from the documented GAP objects and compares them with their actual types. It immediately reported almost 400 inconsistencies out of 3674 manual entries, 75% of which have been eliminated in a subsequent cleanup.
Figure 12.1: The GAP Knowledge Graph.
Semantics in the Sage Category System
The Sage library includes 40k functions and allows for manipulating thousands of different kinds of objects. In any large system it is critical to tame code bloat by
i) identifying the core concepts describing common behavior among the objects;
ii) implementing generic operations that apply on all objects having a given behavior, with appropriate specializations when performance calls for it;
iii) designing or choosing a process for selecting the best implementation available when calling an operation on some objects.
Following mathematical tradition and the precedent of the Axiom, Fricas, or MuPAD systems, Sage has developed a category-theory-inspired “category system”, and found a way to implement it on top of the underlying Python object system [Dev16; SC]. In short, a category specifies the available operations and the axioms they satisfy. This category system models taxonomic knowledge from mathematics explicitly and uses it to support genericity, control the method selection process, structure the code and documentation, enforce consistency, and provide generic tests.
To generate interface theories from the Sage category system, Nicolas Thiéry et al. are experimenting with a system of annotations in the Sage source files. Consider for instance the situation in Figure 12.3, where we have annotated the Sets() category in Sage with @semantic lines that state correspondences to other interface theories. From these, the Sage-to-MMT exporter can generate the respective interface theories and views.
In ongoing experiments, variants of the annotations are tested for annotating existing categories without touching their source files and providing the signature or the corresponding method names in other systems when this information has not yet been formalized elsewhere.
12.2 Representing the GAP Ontology in OMDoc/Mmt
To facilitate the kind of integration demanded by the OpenDreamKit project (such as remote procedure calls), we need appropriate representations of the functionalities offered by the targeted systems. In particular, we need to represent the underlying ontologies of the systems.
In the case of GAP, this ontology is based on the notion of filters, as described above: Every object belongs to several categories and satisfies a list of filters, which can be thought of as predicates (e.g. IsGroup and Abelian, jointly specifying that the object is an abelian group). The way a user interacts with the system is by calling operations (e.g. the order of a group) on an object, which has to satisfy certain filters in order for the operation to be applicable. Notably however, each operation is implemented by possibly multiple methods: specific algorithms for computing an operation. The system uses the known filters of an object to determine which specific method to use for a specific operation. For example, the order of a group might be easier to compute if the group is known to be abelian, hence the system would choose a more appropriate method to compute the order of an abelian group than for a group that is not (known to be) abelian. Notably, the system chooses the method automatically; a user only interacts with the system by calling operations.
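The selection mechanism can be sketched as follows (toy filters and methods, not GAP code): each method declares its required filters, and among the applicable methods the one with the most specific (largest) filter set wins:

```python
# Sketch of GAP-style method selection: one operation, several methods,
# each guarded by a set of required filters (all names illustrative).
methods_for_order = [
    ({"IsGroup"}, lambda g: f"generic order of {g['name']}"),
    ({"IsGroup", "IsAbelian"}, lambda g: f"fast abelian order of {g['name']}"),
]

def call_operation(methods, obj):
    # a method is applicable if the object satisfies all required filters
    applicable = [(req, m) for req, m in methods if req <= obj["filters"]]
    if not applicable:
        raise TypeError("no applicable method")
    # pick the applicable method with the most specific filter set
    _, best = max(applicable, key=lambda rm: len(rm[0]))
    return best(obj)

g1 = {"name": "G1", "filters": {"IsGroup"}}
g2 = {"name": "G2", "filters": {"IsGroup", "IsAbelian"}}
print(call_operation(methods_for_order, g1))  # generic order of G1
print(call_operation(methods_for_order, g2))  # fast abelian order of G2
```

The user only ever calls the operation; which method runs depends entirely on the filters known for the object.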
This notion of filters effectively yields a soft typing system, which we formalize as in Listing 12.2.
Listing 12.2: The GAP Foundational Ontology in Mmt
The most primitive notions in the GAP ontology are objects and categories, which we represent as LF types. Naturally, GAP uses literals for booleans, integers and real numbers; consequently we introduce new types for these and populate them with literals as well (lines 5–10). To convert our literals to GAP objects, we introduce conversion functions (lines 12–14). These GAP-native literals do not occur in the JSON export; however, they are needed for most services implemented in the OpenDreamKit project. For example, they are needed for translating even simple commands such as 2 + 2 into the GAP ontology to allow for remote procedure calls.
We formalize filters as functions object → type and effectively use judgments-as-types for the judgment o ⊢ f representing “object o has filter f”. The filter_and-operator corresponds to a conjunction of filters. catFilter allows building the corresponding filter for a given category; conversely, CategoryCollection allows forming the category of all objects satisfying a given filter. An operation f is considered a property if it returns a boolean value. In this case we can form the corresponding filter propertyFilter(f) that an object satisfies iff the operation f returns true on that object.
Using this formalization, we can systematically translate the contents of the JSON-export of the GAP ontology mentioned above into OMDoc/Mmt declarations.
The export contains a JSON object for each category and operation. For categories, it additionally contains the list of implied filters; for example, the category IsAbelianNumberFieldPolynomialRing implies filters such as IsCollection and IsDuplicateFree. Naturally, a category C is translated to a constant of type category. For each implied filter F, we generate an additional constant of type ∏x:object. (x ⊢ catFilter(C)) → (x ⊢ F).
Unfortunately, the current JSON export does not yet contain any information regarding the required filters of an operation, or the filters of the return object. In the course of the work on the export described above, attempts are being made to make those accessible as well. For now, all operations are correspondingly simply typed as object^n → object, depending on the arity.
While more detailed information regarding the filters would be desirable for documentation purposes, the above import methodology is sufficient to translate GAP objects, operations, and applications of operations to objects into well-formed OMDoc/Mmt expressions.
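A sketch of this translation step, assuming a simplified JSON shape (the actual export format and the generated Mmt declarations differ in detail):

```python
# Hedged sketch of the import: each category entry becomes a constant of
# type `category`, each implied filter an implication constant, and each
# operation a function on `object` of its arity (toy JSON shape).
export = [
    {"kind": "category", "name": "IsAbelianNumberFieldPolynomialRing",
     "implied": ["IsCollection", "IsDuplicateFree"]},
    {"kind": "operation", "name": "Size", "arity": 1},
]

def translate(entries):
    decls = []
    for e in entries:
        if e["kind"] == "category":
            decls.append(f'{e["name"]} : category')
            for f in e["implied"]:
                # one implication constant per implied filter
                decls.append(
                    f'{e["name"]}_implies_{f} : '
                    f'{{x: object}} x ⊢ catFilter {e["name"]} → x ⊢ {f}')
        else:
            # operations are typed object^n → object for now
            dom = " → ".join(["object"] * e["arity"])
            decls.append(f'{e["name"]} : {dom} → object')
    return decls

for d in translate(export):
    print(d)
```

Running this on the toy export prints one declaration per category, implied filter and operation.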
12.3 Representing the Sage Ontology in OMDoc/Mmt
Having covered the GAP ontology already, extending the same methodology to a similar system such as Sage is in many respects straightforward. Notably however, whereas GAP implements its own fundamental notions for even simple concepts like integers, Sage is a Python library, and as such operates on many of Python’s datatypes directly. Consequently, we start with a formalization of the Python datatypes relevant for our purposes:
Listing 12.3: Python Datatypes in Mmt
Note that since Python uses a dynamic type system, we cannot assign a fixed type to a given Python object. This is reflected by our ontology, which specifies even the constructors for the various types as having type python_object rather than some dependent type on python_type. As with GAP, these datatypes are not needed for the import of the ontology, but rather for services that operate on expressions within the ontology.
For the ontology itself, we use – as with GAP – a generic type object.
Listing 12.4: The Sage Foundational Ontology in Mmt
As mentioned above, knowledge in Sage – and hence in its JSON export – is organized in categories (e.g. Ring). A category satisfies certain axioms/properties (which after translation live in the type prop in Listing 12.4) and implements certain structures (or signatures, which live in structural) – e.g. the category of rational polynomials implements the signature of a ring, and satisfies (among others) the axioms of a commutative ring.
Each category provides certain methods; namely element methods that operate on the elements in an object of the category (e.g. addition or multiplication in a ring), parent methods that operate on an object of the category directly (e.g. its characteristic or degree), and morphism methods that operate on the respective morphisms of the category.
We translate categories directly to OMDoc/Mmt theories. The JSON export currently provides no further information on either the axioms or the structures. Correspondingly, we treat them as atomic objects of the respective type. We collect all axioms and structures occurring during the import and store them in separate generated theories Axioms and Structures, which are included in all the theories corresponding to categories. If a category C satisfies an axiom A or implements a structure S, we add to the theory constants of types ⊢ A or ≤ S, respectively.
As with GAP, we currently have no information about the input or output types (or categories) of methods; hence we are left with translating those to functions object^n → object, depending on their arity. Again, this yields a sufficient formalization of the functionalities provided by Sage to facilitate the goals of the OpenDreamKit project.
Chapter 13
Conclusion
We have seen how, using the logical frameworks developed in Part II, we can represent libraries (in the broadest sense) of formal systems in OMDoc/Mmt, including both fully formal libraries as obtained by theorem prover systems, as well as the ontologies of less formal systems such as computer algebra systems and databases of mathematical objects.
In Section 13.1 below, we will look at some suggested applications enabled by representing a single (fully formal) library in OMDoc/Mmt; however, in light of the title of this thesis, we are much more interested in applications enabled across libraries. Consequently, the important result of this part is not so much the ability to represent a single library, but rather the availability of a universal framework in which multiple libraries/ontologies of various systems can be (and are) represented, as a prerequisite for the knowledge management services described in Part IV. As such, the intended goals of the OpenDreamKit project are much more in line with the aims of this thesis, even though the systems involved therein (covered in Chapters 11 and 12) are less formal than the fully formal – and hence for knowledge management purposes more interesting – libraries discussed in Chapter 10.
13.1 Enabled Applications
With the OMDoc/Mmt translation of formal libraries, the originating systems gain access to library management facilities implemented at the OMDoc/Mmt level.
There are two ways to exploit this: publishing the converted libraries on a dedicated server, like the MathHub system [MH], or running the OMDoc/Mmt toolstack locally. Both options offer similar functionality, the main difference is the intended audience: the first option is for outside users who want to access the libraries, and the latter is for users who develop new content or refactor the library.
MathHub (see Section 4.1.2) bundles a GitLab-based repository manager with Mmt and various periphery systems into a common, web-based user interface. We commit the exported libraries – exemplarily from PVS– as OMDoc/Mmt files into a repository to make these available via the i) MathHub user interface, ii) Mmt presentation web server, iii) Mmt web services, and iv) the MathWebSearch daemon. All of these components give the user different ways of interacting with the system and content. Below we explore three examples that are directly useful for PVS users, but the services apply similarly to users of other systems.
The local workflow installs OMDoc/Mmt tools on the same machine as PVS. In that case, users are able to browse the current version of the available PVS libraries including all experimental or private theories that are part of the current development. This also enables PVS to use OMDoc/Mmt services as background tools that remain transparent to the PVS user.
In both workflows, OMDoc/Mmt-based periphery systems become available to the PVS user that the PVS tools either do not provide at all or only in a much more restricted way. We will go over the three most important ones in detail.
13.1.1 Browsing and Interaction
The transformed PVS content can be browsed interactively in the document-oriented MathHub presentation pages (theories as active documents) and in the Mmt web browser (see Figure 13.1). Both allow interaction with the PVS content via a generic Javascript-based interface. This provides buttons to toggle the visibility of parts computed by PVS – e.g. omitted types and definitions – at the declaration level. The right-click menu shown in Figure 13.1 is specific to the selected sub-formula (highlighted in gray); here we have eight applicable interactions which range from inferring the subformula type via definition lookup to management actions such as registering an alignment to concepts in other libraries. New interactions can be added as they become available in the MMT system.
The Mmt instance in the local workflow provides the additional feature of inter-process communication between PVS and Mmt as a new menu item: the action navigate to this declaration in connected systems. Florian Rabe implemented a listener for this action that forwards the command to PVS via an XML-RPC call at the default PVS port. Correspondingly, Sam Owre implemented an experimental handler in the PVS server that opens the corresponding file in the PVS emacs system and navigates to the relevant line – unfortunately, due to time constraints this functionality has so far not been fully developed and integrated in an official PVS release.
13.1.2 Graph Viewer
MathHub includes a theory graph viewer developed by Marcel Rupprecht that allows interactive, web-based exploration of the OMDoc/Mmt theory graphs [RKM17]. It builds on the visjs JavaScript visualization library [VJS], which uses the HTML5 canvas to lay out and interact with graphs client-side in the browser.
PVS libraries make heavy use of theories as a structuring mechanism, which makes a graph viewer for PVS particularly attractive. Figure 13.2 shows the full graph in a central-gravity layout induced by the PVS prelude, where we have (manually) clustered the subgraphs for bit vectors and finite sets (the orange barrel-shaped nodes). The lower right corner shows a zoomed-in fragment.
The theory graph allows dragging nodes around to fine-tune the layout. Hovering over a node or edge triggers a preview of the theory. All nodes support the same context menu actions in the graph viewer as the corresponding theories do in the browser above. Thus, it is possible to select a theory in the graph viewer and then navigate to it in the browser or (if run locally) in the PVS system.
13.1.3 Search
MathWebSearch [KŞ06] is an OMDoc/Mmt-level formula search engine that uses query variables for subterms and first-order unification as the query language. It is developed independently, but Mmt includes a plugin for generating MathWebSearch index files using its content MathML interface. Thus, any library available to Mmt can be indexed and searched via MathWebSearch. Moreover, Mmt includes a frontend for MathWebSearch so that search queries can be supplied in any format that Mmt can understand, e.g., the XML format produced by PVS.
Mmt exposes the search frontend both in its GUI for humans and as an HTTP service for other systems. Here we use the latter: We have added a feature to the PVS emacs interface that allows users to enter a search query in PVS syntax. PVS parses the query, type-checks it, and converts it to XML. The XML is sent to Mmt, which acts as the mediator between the proof assistant — here PVS — and library management periphery — here MathWebSearch— and returns the search results to PVS.
The PVS user enters the PVS query EquivClos(?A), where we have extended the PVS syntax with query variables like ?A. After OMDoc/Mmt translation, this becomes the MathWebSearch query in Figure 13.3 — note the additional symbols from LF introduced by the representation in the logical framework. The representation also introduces unknown meta-variables for the domain and range of the EquivClos function, which become the additional query variables I1 and I2. MathWebSearch returns a JSON record with all results, and we show the first two in Figure 13.4: two occurrences of (instances of) EquivClos(?A) in two declarations in the theory EquivalenceClosure from Figure 10.1. The attribute lib_name is the name of the library; by PVS convention, it is empty for the Prelude. The attributes theory_name and name give the declaration that contains the match, and Position gives the path to its subterm that matched the query.
Figure 13.5 shows what the query looks like while doing a PVS proof. The current implementation is just a proof-of-concept — for a mature version, the part of PVS that sends the query to the Mmt server and displays the results still has to be implemented thoroughly. But the remaining steps are straightforward.
Future work could potentially exploit this functionality to search specifically for existing theorems that may be helpful in a specific part of an ongoing PVS proof.
13.1.4 Querying Across Libraries
Going beyond a single system, we exported relational data for several theorem prover libraries as RDF triples, enabling more advanced queries without relying on the Mmt system (as published in [Con+]). This relational data is already contained in the underlying OMDoc representation of Mmt archives and can thus be easily generated from imported libraries. In conjunction with alignments (which we will discuss in Chapter 14), this allows for conveniently querying for mathematical concepts across different theorem prover libraries using an arbitrary triple store as backend. For example, for the cited paper I set up an instance of Virtuoso Open-Source, providing a SPARQL endpoint. Using the query, we receive as results (shown in Figure 13.6) all functions and predicates defined by induction on the natural numbers regardless of their implementation, by only considering those symbols that are aligned with the Math-in-the-Middle symbol for the type of natural numbers.
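The shape of such a cross-library query can be illustrated with plain data structures: the sketch below mimics the triple-store join in pure Python. All predicate and symbol names (the `ulo:`/`mitm:` URIs, `coq:Nat.add`, etc.) are hypothetical stand-ins for illustration, not the actual vocabulary of the export.

```python
# Sketch: cross-library query over RDF-style triples (all names hypothetical).
TRIPLES = [
    # (subject, predicate, object)
    ("coq:Nat.add",  "ulo:defined-by-induction-on", "coq:nat"),
    ("lean:nat.mul", "ulo:defined-by-induction-on", "lean:nat"),
    ("coq:Real.exp", "ulo:defined-by-induction-on", "coq:real"),
    ("coq:nat",  "ulo:aligned-with", "mitm:NaturalNumbers"),
    ("lean:nat", "ulo:aligned-with", "mitm:NaturalNumbers"),
]

def query(triples, target="mitm:NaturalNumbers"):
    """All symbols defined by induction on a type aligned with `target`."""
    aligned = {s for (s, p, o) in triples
               if p == "ulo:aligned-with" and o == target}
    return sorted(s for (s, p, o) in triples
                  if p == "ulo:defined-by-induction-on" and o in aligned)

print(query(TRIPLES))  # -> ['coq:Nat.add', 'lean:nat.mul']
```

A real SPARQL endpoint performs the same join declaratively; the point here is only that alignment triples are what connect the per-library data sets.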
Figure 13.5: Example for Displaying the Query Result in PVS
Figure 13.6: Virtuoso Output for the Example Query using Alignments
Part IV
Cross-Library Knowledge Management
Now that we have a unifying framework in which many different libraries are represented, we can approach the problem of integrating these libraries with each other, e.g. to translate knowledge between them.
Naturally, these libraries use different foundations, which is reflected in Mmt by the contained theories having different meta-theories. Consequently, as a first step one might want to integrate the foundations themselves, i.e. by implementing systematic translations between them.
An obvious challenge arises here, namely that the foundations used by formal systems are usually mutually incompatible. This is obvious for systems using fundamentally different logics, such as set theories vs. type theories, but even if the platonic foundation used by two systems is the same (e.g. higher-order logic), there are subtle differences in the specific implementations – and hence ontologies – of the foundations that make implementing such translations less straight-forward than one might at first suspect. For example, the systems might offer different additional features (e.g. records, subtyping) for which there is no universally valid elimination procedure. Additionally, systems usually offer distinct convenience operations, automations, module systems and conventions that suggest different approaches and best practices for formalizing content, leading to very different library developments for the same mathematical concept even if the foundation used by the systems is the same.
As a result, even when a translation between foundations is straight-forward (which it usually is not), the result of translating content between libraries on the basis of a translation between the foundations can yield awkward reformalizations of known content completely disconnected from the content already available in the target library.
Consequently, any generic methodology for integrating libraries needs to go beyond – or even entirely bypass – mere foundations and take distinct library developments into account. This part of this thesis presents our approach to library integration on that premise.
First, we introduce the notion of alignments in Chapter 14. This is a binary relation between symbols expressing that the two symbols (possibly from different libraries) denote the same abstract mathematical concept. In the simplest case, we can then translate symbol occurrences in expressions across libraries by merely substituting along alignments. We then classify various more complex types of alignments (e.g. “up to argument order” or “up to additional arguments”) that occur in practice and develop a convenient way to specify them.
Chapter 15 generalizes this approach to translating whole expressions: using alignments wherever possible, but additionally using theory morphisms (within or across libraries) and – as is crucial for translating across different foundations – programmatic translations, which can be supplied for more complex situations.
To avoid having to provide ad-hoc translation mechanisms between all pairs of libraries, we instead use translations between each library and a fixed set of interface theories, in line with the Math-in-the-Middle approach (see Section 3.4). In a library of interface theories, we can implement all possible developments and definitions of the same mathematical concept and connect them via theory morphisms, lifting the more complicated translation problems to a realm governed by a single, flexible foundation using the logical frameworks developed in Part II. Not surprisingly, this methodology is used to implement the functionalities aimed at by the OpenDreamKit project (see Section 3.3) – Section 15.3 shows a real-world use case from the OpenDreamKit community that is covered by our approach.
In Chapter 16, we describe how to automatically find theory morphisms; both within and across different libraries. These can be used to identify overlap between libraries and consequently suggest new alignments and translations between the theories; ideally resulting in a positive feedback loop of ever increasing connections between libraries.
Lastly, Chapter 17 develops a method for using theory morphisms to refactor library content to increase modularity within a given library.
Chapter 14
Alignments
Disclaimer:
The contents of this chapter have largely been previously published as [Mül+17b] (Extended in [Mül+17c]), [Kal+16] and [Mül+17a] with coauthors Cezary Kaliszyk, Thibault Gauthier, Michael Kohlhase, Florian Rabe, Colin Rothgang and Yufei Liu. Both the writing as well as the theoretical results were developed in close collaboration between the authors, hence it is impossible to precisely assign authorship to individual authors. The writing has been reworked for this thesis.
My contribution in this chapter consists of the actual implementations presented in Section 14.4 and the practical evaluation of the concept.
The sciences are increasingly collecting and curating their knowledge systematically in machine-processable corpora. For example, in biology many important corpora take the form of ontologies, e.g., as collected on BioPortal. These corpora typically overlap substantially, and much recent work has focused on integrating them. A central problem here is to find alignments: pairs (𝑎1, 𝑎2) of identifiers from different corpora that describe the same concept, giving rise to ontology matching [ESC07].
In the certification of programs and proofs, the ontology matching problem is most apparent when trying to use multiple reasoning systems together. For example, Wiedijk [Wie06] explored a single theorem (and its proof) across 17 proof assistants implicitly generating alignments between the concepts present in the theorem’s statement and proof. The Why3 system [Bob+11] maintains a set of translations into different reasoning systems for discharging proof obligations. Each translation must manually code individual alignments of elementary concepts such as integers or lists in order to fully utilize the respective system’s automation potential. But automating the generation and use of alignments, which would be necessary to scale up such efforts, is challenging because the knowledge involves rigorous notations, definitions, and properties, which leads to very diverse corpora with complex alignment options. This makes it very difficult to determine whether an alignment is perfect (we will attempt to define this notion in the next section), or to predict whether an imperfect alignment will work just as well or not at all.
• Translations along alignments will be covered in Chapter 15.
• One such refactoring technique will be covered in Chapter 17.
Finding Alignments
Even though we know that numerous alignments exist between libraries, not many alignments are known concretely or have been represented explicitly. Therefore, a major initial investment is necessary to obtain a large library of interface theories and alignments. There are three groups of alignment-finding approaches.
Human-based approaches examine libraries and manually identify alignments. This approach has been pursued ad hoc in various contexts. For example, the library translation of [OS06b] included some alignments for HOL Light and Isabelle/HOL, which were later expanded by Kaliszyk. The Why3 and FoCaLiZe systems include alignments to various theorem provers that they use as backends. Results from this approach will be discussed in Section 14.5.
The remaining two classes use machine learning methods.
Logical approaches align two concepts if they satisfy the same theorems. This cannot directly appeal to logical equivalence because that would require a translation between the corpora. Instead, they compare the theorems that are explicitly stated in the corpora. The theorems should be normalized first to eliminate differences that do not affect logical equivalence. The quality of the results depends on how completely those theorems characterize the respective concepts. In well-curated libraries, this can be very high [MK15]. The Viewfinder presented in Chapter 16 can be interpreted as implementing this approach.
Machine learning–based approaches are inherently based on statistical patterns and hence naturally inexact. The main research in this direction is carried out by Kaliszyk and others [GK14b].
Automatic Search for Alignments
Finding alignments, preferably automatically, has proved extremely difficult in general. There are three reasons for this: the conceptual differences between logical corpora (found in proof assistants), computational corpora (containing algorithms from computer algebra systems), and narrative corpora (consisting of semi-formal descriptions in wiki-related tools); the diversity of the underlying formal languages and tools; and the differences between the organization of the knowledge in the corpora.
While finding alignments automatically is promising for perfect alignments, where two symbols only differ in name but are otherwise used identically, it is a different matter for imperfect alignments. For example, consider binary division which yields undefined or 0 when the divisor is zero, versus a strict division that requires an additional proof-argument that the divisor is nonzero. Here automation becomes much more difficult, because the imperfections often violate the conditions that an automatic approach uses to spot alignments.
I conjecture that the quality of artificial intelligence approaches will be massively improved if they are applied on top of a large set of guaranteed-perfect alignments, besides their obvious use as training data. This is based on two observations:
• The more alignments we know, the easier it is to find new ones. Consider a typical formalization in system …
• It is very difficult to get alignment-finding off the ground, because the foundations of …

Recently, heuristic methods for automatically finding alignments were developed [GK14b], targeted at integrating logical corpora, which were integrated into our developments in Section 14.1. Independently, Deyan Ginev built a library [GC14] of about 50,000 alignments between narrative corpora including Wikipedia, Wolfram MathWorld, PlanetMath and the SMGloM semantic multilingual glossary for mathematics. For this, the NNexus system indexes the corpora and applies clustering algorithms to discover concepts.
Related Work
Alignments between computational corpora occur in bridges between the run time systems of programming languages. Alignments between logical and computational corpora are used in proof assistants with code generation such as Isabelle [WPN08] and Coq [Coq15]. Here functions defined in the logic are aligned with their implementations in the programming language in order to generate fast executable code from formalizations.
The dominant methods for integrating logical corpora so far have focused on truth-preserving translations between the underlying knowledge representation languages. For example, [KS10] translates from Isabelle/HOL to Isabelle/ZF. [KW10] translates from HOL Light to Coq, [OS06a] to Isabelle/HOL, and [NSM01] to Nuprl. Older versions of Matita [Asp+06a] were able to read Coq compiled theory files. [Cod+11] build a library of translations between different logics.
However, as mentioned in the introduction to this part, most translations are not alignment-aware, i.e., it is not guaranteed that 𝑎1 will be translated to 𝑎2 even if the alignment is known. This is because 𝑎1 and 𝑎2 may be subtly incompatible, so that a direct translation may even lead to inconsistency or ill-typed results. [OS06a] was — to my knowledge — the first translation that could be parametrized by a set of alignments. The OpenTheory framework [Hur09] provides a number of higher-order logic concept alignments. In [KR16a], the corpus integration problem is discussed, with the conclusion that alignments are of utmost practical importance. Indeed, corpus integration can succeed with only alignment data even if no logic translation is possible. Conversely, logic translations contribute little to corpus integration without alignment data.
14.1 Types of Alignments
Let us assume two corpora 𝐶1, 𝐶2 with underlying foundational logics 𝐹1, 𝐹2. We examine examples for how two concepts 𝑎𝑖 from 𝐶𝑖 can be aligned. Importantly, we allow for the case where 𝑎1 and 𝑎2 represent the same abstract mathematical concept without there being a direct, rigorous translation between them.
The types of alignments in this section are purely phenomenological in nature: they exemplify the difficulty of the problem and provide benchmarks for rigorous definitions. While some types are relatively straightforward, others are so difficult that giving a rigorous definition remains an open problem. This is because alignments ideally legitimize translations from 𝐹1 to 𝐹2 that replace 𝑎1 with 𝑎2. But in many situations these translations, while possible in principle, are much more difficult than simply replacing one symbol with another. The alignment types below are roughly ordered by increasing difficulty of this translation.
Perfect Alignment
If 𝑎1 and 𝑎2 are logically equivalent (modulo a translation 𝜑 between 𝐹1 and 𝐹2 that is fixed in the context), we speak of a perfect alignment. More precisely, all formal properties (type, definition, axioms) of 𝑎1 carry over to 𝑎2 and vice versa. Typical examples are primitive types and their associated operations. Consider

    Nat1 : Type        Nat2 : Type

then translations between 𝐶1 and 𝐶2 can simply interchange 𝑎1 and 𝑎2.
The above example is deceptively non-trivial for two reasons. Firstly, it hides the problem that 𝐹1 and 𝐹2 do not necessarily share the symbol Type. Therefore, we need to assume that there are symbols Type1 and Type2 which have already been aligned (perfectly). Such alignments (on the foundations of 𝐶1 and 𝐶2) are crucial for all fundamental constructors that occur in the types and characteristic theorems of the symbols we want to align, such as Type, →, bool, ∧, etc. These alignments can be handled with the same methodology as discussed here. Therefore, here and below, we assume we have such alignments and simply use the same fundamental constructors for 𝐹1 and 𝐹2.
Secondly, it ignores that we usually want (and can reasonably expect) only certain formal properties to carry over, namely those in the interface theory in the sense of [KR16a] — i.e. those properties that are still meaningful after abstracting away from the specific foundational logics 𝐹𝑖. We will look at some examples of perfect alignments between symbols that use different but equivalent definitions in Section 14.2.
Alignment up to Argument Order
Two function symbols can be perfectly aligned except that their arguments must be reordered when translating. The most common example is function composition, whose arguments may be given in application order (𝑔 ○ 𝑓) or in diagram order (𝑓 ; 𝑔). Another example is given by containment predicates such as contains1(T, A, x) and in2(T, x, A), which can be translated into each other.
Alignment up to Determined Arguments
The perfect alignment of two function symbols may be broken because they have different types, even though they agree in most of their properties. This often occurs when 𝐹1 uses a more fine-granular type system than 𝐹2, which requires additional arguments. Examples are untyped and typed (polymorphic, homogeneous) equality: the former is binary, while the latter is ternary. The types can be aligned if we apply eq2 to 𝜑(Set). Similar examples arise between simply- and dependently-typed foundations, where symbols in the latter take additional arguments.

These additional arguments are uniquely determined by the values of the other arguments, and a translation from 𝐶1 to 𝐶2 can drop them, whereas the reverse translation must infer them – but 𝐹1 usually has functionality for that (e.g. the type parameter of polymorphic equality is usually uniquely determined).
The additional arguments can also be proofs, used for example to represent partial functions as total functions, such as a binary and a ternary division operator. Here inferring the third argument is undecidable in general, and it is unique only in the presence of proof irrelevance.
Alignment up to Totality of Functions
The functions 𝑎1 and 𝑎2 can be aligned everywhere where both are defined. This happens frequently, since it is often convenient to represent partial functions as total ones by assigning values to all arguments. The most common example is division: two implementations div1 and div2 might both have the type Real → Real → Real, with 𝑥 div1 0 undefined and 𝑥 div2 0 = 0.

Here a translation from 𝐶1 to 𝐶2 can always replace div1 with div2. The reverse translation can usually replace div2 with div1, but not always: in translation-worthy data-expressions it is typically sound, while in formulas it can easily be unsound, because theorems about div2 might not require the restriction to non-zero denominators.
Alignment for Certain Arguments
Two function symbols may be aligned only for certain arguments. This occurs if 𝑎1 has a smaller domain than 𝑎2. The most fundamental case is the function type constructor → itself. For example, →1 may be first-order in 𝐹1 and →2 higher-order in 𝐹2. Thus, a translation from 𝐶1 to 𝐶2 can replace →1 with →2, whereas the reverse translation must be partial.

Another important class of examples is given by subtyping (or the lack thereof). For example, we could have …
Alignment up to Associativity
An associative binary function (either logically associative or notationally right- or left-associative) can be defined as a flexary function, i.e., a function taking an arbitrarily long sequence of arguments. In this case, translations must fold or unfold the argument sequence. For example

    plus1 : Nat → Nat → Nat        plus2 : List Nat → Nat

All of the above types of alignments allow us to translate expressions between our corpora by modifying the lists of arguments the respective symbols are applied to, even if not always in a straight-forward way. The following types of alignments are more abstract, and any translation along them might be more dependent on the specifics of the symbols under consideration.
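Translating along the associativity alignment means folding or unfolding argument sequences. A sketch over terms as nested tuples, assuming a right-associative reading (the nesting direction is a convention, not prescribed by the alignment):

```python
# Sketch: alignment up to associativity - folding a flexary application
# plus2([x1, ..., xn]) into nested binary plus1 applications, and back.
from functools import reduce

def fold_plus(args):
    """plus2(args) -> right-nested plus1 applications."""
    return reduce(lambda acc, x: ("plus1", x, acc), reversed(args[:-1]), args[-1])

def unfold_plus(term):
    """Nested plus1 applications -> flat argument list for plus2."""
    if isinstance(term, tuple) and term[0] == "plus1":
        return [term[1]] + unfold_plus(term[2])
    return [term]

t = fold_plus(["a", "b", "c"])
print(t)                  # -> ('plus1', 'a', ('plus1', 'b', 'c'))
print(unfold_plus(t))     # -> ['a', 'b', 'c']
```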
Contextual AlignmentsTwo symbols may be aligned only in certain contexts. For example, the complex numbers are represented as pairs of real numbers in some proof assistant libraries and as an inductive data type in others. Then only selected occurrences of pairs of real numbers can be aligned with the complex numbers.
Alignment with a Set of Declarations
Here a single declaration in 𝐶1 is aligned with a set of declarations in 𝐶2. An example is a conjunction 𝑎1 in 𝐶1 of axioms aligned with a set of single axioms in 𝐶2. More generally, the conjunction of a set of 𝐶1-statements may be equivalent to the conjunction of a set of 𝐶2-statements. Here translations are much more involved and may require aggregation or projection operators.
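For the simplest case, a conjunction aligned with the set of its conjuncts, the two translation directions amount to a flatten and an aggregate. A toy sketch (the axiom names are invented):

```python
# Sketch: aligning one conjunction-axiom with a set of single axioms.
def split_conjunction(term):
    """C1 -> C2 direction: flatten nested conjunctions into a set."""
    if isinstance(term, tuple) and term[0] == "and":
        return split_conjunction(term[1]) | split_conjunction(term[2])
    return {term}

def join_conjunction(axioms):
    """C2 -> C1 direction: aggregate a set of axioms into one conjunction."""
    axioms = sorted(axioms)
    out = axioms[0]
    for a in axioms[1:]:
        out = ("and", out, a)
    return out

a1 = ("and", ("and", "assoc", "comm"), "unit")
print(sorted(split_conjunction(a1)))  # -> ['assoc', 'comm', 'unit']
```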
Alignment between the Internal and External Perspective on Theories
When reasoning about complex objects in a proof assistant (such as algebraic structures, or types with comparison), it is convenient to express them as theories that combine the actual type with operations on it or even properties of such operations. The different proof assistants often have incompatible mechanisms for expressing such theories, including type classes, records and functors, with the additional distinction whether they are first-class objects or not. This roughly corresponds to the distinction between stratified and integrated groupings in Chapter 8.

We define the crucial difference for alignments here only by example. We speak of the internal perspective (≈ stratified grouping) if we use a theory like

    theory Magma1 = { u1 : Type, ○1 : u1 → u1 → u1 }

and of the external perspective (≈ integrated grouping) if we use operations like … Here we have a non-trivial, systematic translation from 𝐶1 to 𝐶2. A reverse may also be possible, depending on the details of 𝐹1.
Corpus-Foundation Alignment
Orthogonal to all of the above, we have to consider alignments where a symbol is primitive in one system but defined in another. More concretely, 𝑎1 can be built into 𝐹1 whereas 𝑎2 is defined in 𝐹2. This is common for corpora based on significantly different foundations, as each foundation is likely to select different primitives. Therefore, it mostly occurs for the most basic concepts. For example, the boolean connectives, integers and strings are defined in some systems but primitive in others, as in some foundations they may not be easy to define.

Corpus-foundation alignments can be reduced to the previously considered cases if we follow the approach generally taken in this thesis, where the foundations themselves are represented in an appropriate logical framework. Then 𝑎1 is simply an identifier in the corpus of foundations of the framework.
Opaque Alignments
The above alignments focused on logical corpora, partially because logical corpora allow for a precise and mechanizable treatment of logical equivalence. Indeed, alignments from a logical into a computational or narrative corpus tend to be opaque: whether and in what way the aligned symbols correspond to each other is not (or not easily) machine-understandable. For example, if 𝑎2 refers to a function in a programming language library, that function’s specification may be implicit or given only informally. Even worse, if 𝑎2 is a wiki article, it may be subject to constant revision.

Nonetheless, such alignments are immensely useful in practice and should not be discarded. Therefore, we speak of opaque alignments if 𝑎2 refers to a symbol whose semantics is unclear to machines.
Probabilistic Alignments
Orthogonal to all of the above, the correctness of an alignment may be known only to some degree of certainty. In that case, we speak of probabilistic alignments. These occur in particular when machine-learning techniques are used to find large sets of alignments automatically. This is critical in practice to handle the existing large corpora.
The problem of probabilistically estimating the similarity of concepts in different corpora was studied before in [GK14b]. I briefly restate the relevant aspects in our setting, as described by Thibault Gauthier in [Mül+17b].
Let 𝑇𝑖 be the set of top level expressions occurring in 𝐶𝑖, e.g., the types of all constants and the formulas of all theorems. We assume a fixed set 𝐹 of alignments, covering in particular the foundational concepts in 𝐹1 and 𝐹2.
Definition 14.1:
The pattern 𝑃(𝑓) of an expression 𝑓 is obtained by normalizing 𝑓 to 𝑁(𝑓) and abstracting over all occurrences of concepts that are not in 𝐹, resulting in 𝑃(𝑓) = 𝜆𝑐1...𝑐𝑛. 𝑁(𝑓). If two formulas 𝑓 ∈ 𝑇1 and 𝑔 ∈ 𝑇2 have 𝛼-equivalent patterns 𝜆𝑑1...𝑑𝑚. 𝑁(𝑓) and 𝜆𝑒1...𝑒𝑚. 𝑁(𝑔), we define their induced alignments by 𝐼(𝑓, 𝑔) = {(𝑑1, 𝑒1), ..., (𝑑𝑚, 𝑒𝑚)}. We write 𝐽(𝑝) for the union of all 𝐼(𝑓, 𝑔) with 𝑃(𝑓) =𝛼 𝑃(𝑔) =𝛼 𝑝.
Example 14.1:
For the formula ∀𝑥. 𝑥 = 2 ⋅ 𝜋 ⇒ cos(𝑥) = 0, with 𝐹 not covering the concepts 2, 𝜋, 0, and cos, and using a normal form 𝑁 that exploits the symmetry of equality, we get the pattern 𝜆𝑐1 𝑐2 𝑐3 𝑐4. ∀𝑥. 𝑥 = 𝑐1 ⋅ 𝑐2 ⇒ 𝑐3 = 𝑐4(𝑥).
Let 𝑎1, ..., 𝑎𝑛 be the set of all alignments in any 𝐽(𝑝). We first calculate an initial vector containing the similarities 𝑠𝑖𝑚𝑖 for each 𝑎𝑖 by

    𝑠𝑖𝑚𝑖 = ∑_{𝑝 | 𝑎𝑖 ∈ 𝐽(𝑝)} 1 / ln(2 + 𝑐𝑎𝑟𝑑{𝑓 | 𝑃(𝑓) = 𝑝})

Intuitively, an alignment has a high similarity value if it was produced by a large number of rare patterns.
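This initial vector can be computed directly from the pattern data. In the sketch below, the two dictionaries are made-up stand-ins for the sets {𝑓 | 𝑃(𝑓) = 𝑝} and 𝐽(𝑝) of Definition 14.1:

```python
# Sketch: initial similarities
#   sim_i = sum over {p | a_i in J(p)} of 1 / ln(2 + card{f | P(f) = p}).
from math import log

# Toy data: pattern -> expressions having it, pattern -> induced alignments.
EXPRS_BY_PATTERN = {"p1": ["f1", "g1"], "p2": ["f2", "g2", "h2", "k2"]}
J = {"p1": {("nat1", "nat2"), ("plus1", "plus2")},
     "p2": {("nat1", "nat2")}}

def similarities():
    sim = {}
    for p, aligns in J.items():
        weight = 1 / log(2 + len(EXPRS_BY_PATTERN[p]))  # rare patterns weigh more
        for a in aligns:
            sim[a] = sim.get(a, 0.0) + weight
    return sim

sim = similarities()
# ("nat1", "nat2") is produced by both patterns, so it scores highest:
assert sim[("nat1", "nat2")] > sim[("plus1", "plus2")]
print(sim)
```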
Secondly, we iteratively transform this vector until its values stabilize. The idea behind this dynamical system is that the similarity score of an alignment should depend on the quality of its co-induced alignments. Each iteration step consists of two parts: we multiply the vector with the matrix

    𝑐𝑜𝑟𝑘𝑙 = 𝑐𝑎𝑟𝑑{(𝑓, 𝑔) | 𝑎𝑘 ∈ 𝐼(𝑓, 𝑔) ∧ 𝑎𝑙 ∈ 𝐼(𝑓, 𝑔)}

which measures the correlation between 𝑎𝑘 and 𝑎𝑙, and then (in order to ensure convergence and squash all values into the interval [0; 1]) apply the function 𝑥 ↦ 𝑥/(𝑥 + 1) to each component.
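A self-contained toy sketch of this iteration (the correlation matrix and the starting vector below are invented; in practice both come from the pattern data):

```python
# Sketch: iterating sim' = squash(cor . sim) with squash(x) = x / (x + 1)
# until the similarity vector stabilizes (or a step limit is reached).
def iterate(sim, cor, tol=1e-9, max_steps=1000):
    for _ in range(max_steps):
        new = [sum(cor[k][l] * sim[l] for l in range(len(sim)))
               for k in range(len(sim))]
        new = [x / (x + 1) for x in new]      # squash into [0, 1)
        if max(abs(a - b) for a, b in zip(new, sim)) < tol:
            return new
        sim = new
    return sim

# Toy correlation matrix: alignments 0 and 1 co-occur often, 2 is isolated.
cor = [[3, 2, 0],
       [2, 3, 0],
       [0, 0, 1]]
scores = iterate([0.9, 0.8, 0.7], cor)
assert scores[0] > scores[2] and scores[1] > scores[2]
print([round(x, 3) for x in scores])  # -> [0.8, 0.8, 0.001]
```

The isolated alignment decays towards 0, while the mutually co-induced pair settles at a high common value, which is exactly the reinforcement behaviour the text describes.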
14.2 Examples of Alignments
An essential requirement for relating logical corpora is standardizing the identifiers so that each identifier in a corpus can be uniquely referenced. It is desirable to use a uniform naming schema so that the syntax and semantics of identifiers can be understood and implemented as generically as possible. Mmt URIs have been specifically designed for that purpose, and Part III shows by example how to systematically generate URIs when importing a library. Additionally, in [Mül+17c] we present URI schemas for a list of selected theorem provers.

Using these URIs with abbreviated proof assistant names, the following presents alignments across proof assistants for two representative concepts. Also included are some alignments to programming languages, which are relevant for code generation. Thousands of other alignments, both perfect and imperfect, can be explored on MathHub [PRA].
Cartesian Product. In constructive type theory, there are two common ways of expressing the non-dependent Cartesian product. First, if the foundation has inductive types, such as the Calculus of Inductive Constructions, it can be an inductive type with one binary constructor. This is the case for:
• Coq ? Init/Datatypes ? prod.ind
• Matita ? datatypes/constructors ? Prod.ind
Second, if the foundation defines a dependent sum type, it is possible to express the Cartesian product as its non-dependent special case:
• Isabelle ? CTT/CTT ? times
In higher-order logic, the only way to introduce types is by using the typedef construction, which constructs a new type that is isomorphic to a certain subtype of an existing type. In particular, most HOL-based systems introduce the Cartesian product A × B by using a unary predicate on A → B → bool:
• HOLLight ? pair/type ? prod
• HOL4 ? pair/type ? prod
• Isabelle ? HOL/Product ? prod
In set theory, it is also possible to restrict dependent sum types to obtain the Cartesian product. This approach is used in Isabelle/ZF:
• Isabelle ? ZF/ZF ? cart_prod
In Mizar, the Cartesian product is defined as a functor in first-order logic. The definition involves discharging the well-definedness condition. The defined functor is:
• Mizar ? ZFMISC_1 ? K2
In PVS, the product type constructor is part of the system foundation:
• PVS ? foundation.PVS ? tuple_tp
Cartesian products also appear in most programming languages, and the code generators of various proof assistants use a number of these:
• OCaml ? core ? *
• Haskell ? core ? ,
• Scala ? core ? ,
• CPP ? std ? pair
Informal sources that can be aligned are e.g.:
Concatenation of Lists. In constructive type theory (e.g. for Matita, Coq), the append operation on lists can be defined as a fixed point. In higher-order logic, append for polymorphic lists can be defined by primitive recursion, as done by HOL Light and HOL4. Isabelle/HOL slightly differs from these two because it uses lists that were built with the co-datatype package [Bla+14]. PVS and Isabelle/ZF also use primitive recursion for monomorphic lists. In Mizar, lists are represented by finite sequences, which are functions from a finite subset of the natural numbers (one-based FINSEQ and zero-based XFINSEQ), with append provided. Concatenation of lists is also common in programming languages.
• Coq ? Init/Datatypes ? app
• HOLLight ? lists/const ? APPEND
• HOL4 ? list/const ? APPEND
• Isabelle ? HOL/List ? append
• PVS ? Prelude.list_props ? append
• Isabelle ? ZF/List_ZF ? app
• Mizar ? ORDINAL4 ? K1
• OCaml ? core ? @
• Haskell ? core ? ++
• Scala ? core ? ++
14.3 A Standard Syntax for Alignments
Based on the observations of the previous sections, we now define a standard for alignments. Because many of the alignment types described in Section 14.1 are very difficult to handle rigorously and additional alignment types may be discovered in the future, we opt for a very simple and flexible definition.
Concretely, we use the following formal grammar for collections of alignments:
Collection ::= (Comment | NSDef | Alignment)*
Comment    ::= // String
NSDef      ::= namespace String URI
Alignment  ::= URI URI (String = "String")*
Our definition aims at practicality, especially considering the typical case where researchers exchange and manipulate large collections of alignments. Therefore, our grammar allows for comments and for the introduction of short namespace definitions that abbreviate long namespaces. Our grammar represents each individual alignment as a pair of two URIs with arbitrary additional data stored as a list of key-value pairs.
The additional data in alignments makes our standard extensible: any user can standardize individual keys in order to define specific types of alignments. For example, for alignments up to argument order, we can add a key for giving the argument order. Moreover, this can be used to annotate metadata such as provenance or system versions.
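For illustration, a small collection in this syntax might look as follows. The namespace URIs, the abbreviation style, and the concrete entries are hypothetical examples of mine, not entries of the actual [PRA] repository:

```
// alignments between two hypothetical libraries
namespace hol http://example.org/hollight
namespace pvs http://example.org/pvs

hol:?pair?IN      pvs:?sets?member        direction="both"
hol:?lists?FILTER pvs:?list_props?filter  direction="both" arguments="(2,3)(3,2)"
```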
Below we standardize some individual keys and use them to implement the most important alignment types from Section 14.1. In all definitions below, we assume that a₁ and a₂ are the aligned symbols.
Definition 14.2:
The key direction has the possible values forward, backward, or both. Its presence legitimizes, respectively, the translation that replaces every occurrence of a₁ with a₂, its inverse, or both.
Alignments with a direction key subsume the alignment type of perfect alignments (where the direction is both) and the unidirectional types of alignment up to totality of functions or up to associativity, and alignment for certain arguments. The absence of this key indicates those alignment types where no symbol-to-symbol translation is possible, in particular opaque alignments.
Definition 14.3:
The key arguments has values of the form (r₁, s₁) ... (rₖ, sₖ) where the rᵢ and sᵢ are natural numbers. Its presence legitimizes the translation of a₁(x₁, ..., xₘ) to a₂(y₁, ..., yₙ) where each yₖ is defined by
• if k = sᵢ for some i: yₖ = x_{rᵢ},
• otherwise: inferred from the context.
Alignments with an arguments key subsume the alignment types of alignments up to argument order and of alignment up to determined arguments.
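A sketch of how such an arguments value could be applied mechanically; representing "inferred from the context" by None is my own convention, not part of the standard:

```python
def apply_argument_alignment(args1, pairs, arity2):
    """Translate the argument list of a1 into an argument list for a2.

    args1:  arguments x_1 .. x_m of a1 (as a 0-indexed Python list)
    pairs:  the (r_i, s_i) pairs of the arguments key, 1-indexed as in Definition 14.3
    arity2: the number n of arguments of a2
    """
    args2 = [None] * arity2          # None marks "to be inferred from the context"
    for r, s in pairs:
        args2[s - 1] = args1[r - 1]  # y_{s_i} = x_{r_i}
    return args2
```

For example, the pairs (1,1)(2,3)(3,2) keep the first argument and swap the remaining two.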
Example 14.2:
We obtain the following argument alignments for some of the examples from Section 14.1:
Definition 14.4:
The key similarity has values that are real numbers in [0; 1]. If used together with other keys like direction and arguments, it represents a certainty score for the correctness of the corresponding translation. If absent, its value is 1, indicating perfect certainty.
14.4 Implementation
I have implemented alignments in the Mmt system. Moreover, I have created a public repository [PRA] and seeded it with a number of alignments (currently ≈12,000), including the ones mentioned above and below in Section 14.5. The README of this repository furthermore describes the syntax for alignments above, as well as the URI schemata for several proof assistants. The Mmt system can be used to parse and serve all these alignments, implement the transitive closure, and (if possible) translate expressions according to alignments (see Chapter 15 for the details). Available alignments are shown in the Mmt browser.
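The transitive-closure step can be sketched as follows. Combining certainty scores multiplicatively and ignoring the direction key are simplifying assumptions of mine, not necessarily what the Mmt implementation does:

```python
def transitive_closure(alignments):
    """alignments: dict mapping (uri1, uri2) -> similarity score in [0, 1].
    Repeatedly composes alignments a->b and b->c into a->c, keeping the
    best (highest) combined score, until a fixed point is reached."""
    closure = dict(alignments)
    changed = True
    while changed:
        changed = False
        for (a, b), s1 in list(closure.items()):
            for (c, d), s2 in list(closure.items()):
                if b == c:
                    s = s1 * s2  # combined certainty of the composed alignment
                    if closure.get((a, d), 0.0) < s:
                        closure[(a, d)] = s
                        changed = True
    return closure
```

Since composed scores never exceed their factors, the iteration terminates even on cyclic inputs.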
As an example service, I built a prototypical alignment-based math dictionary collecting formal and informal resources.
For this we extend the above grammar by the following:

Alignment ::= String URI

This assigns a mathematical concept (identified by the string) to a formal or informal resource (identified by the URI). The dictionary uses the above public repository, so additions to the latter will be added to the former. We have imported the ≈50,000 conceptual alignments from [GC14], although we chose not to add them to the dictionary yet, since the majority of them are (due to the different intention behind the conceptual mappings in Nnexus) dubious, highly contextual or otherwise undesirable.
Each entry in the dictionary shows snippets from select online resources if available (Figure 14.1), lists the associated formal statements (Figure 14.2) and available alignments between them (Figure 14.3), and allows for conveniently adding new individual URIs to concept entries as well as new formal alignments (Figures 14.1 and 14.4 respectively).
14.5 Manually Curated Alignments
As an experiment, we hired two mathematics students at Jacobs University Bremen – Colin Rothgang and Yufei Liu – to manually comb through the libraries of HOL Light, PVS, Mizar and Coq to find alignments. Specifically, they picked the mathematical areas of numbers, sets (as well as lists), abstract algebra, calculus, combinatorics, logic, topology, and graphs as a sample. This produced around 900 declarations overall, from which they constructed interface theories, as presented in Section 15.1. Notably, neither of the two students had prior in-depth knowledge of theorem prover systems or their libraries.
Alignments in topology pose some additional difficulties. Firstly, HOL Light defines a topology on some subset of the universal set of a type, whereas PVS defines it on a type directly. Thus, the alignment from HOL Light to the interface theory is unidirectional. Secondly, Mizar does not define the notion of a topology, but instead the notion of a topological space. Therefore, the students aligned all these symbols to two different symbols in an interface theory (topology and topological_space) and defined topological_space based on topology.
The set of all alignments they found can be inspected at https://gl.mathhub.info/alignments/Public/tree/master/manual. Many of these alignments are by now outdated, due to recent developments on Mmt, including an import for Coq (see [MRS]) which had us rethink their Mmt URI schema, and changes in the surface syntax that require the archive of interface theories (discussed in Section 15.1) to be cleaned up and partially reimplemented. However, as an experiment, it shows that finding and implementing alignments manually is surprisingly easy even with very little prior knowledge of the formal systems.
During the course of collecting alignments, Rothgang and Liu identified the following two aspects as the most common causes of imperfect alignments:
•
•
| Concept            | PVS (Standard)               | HOL Light (Standard) | Mizar (Standard)   | Coq (Standard)         |
| natural numbers    | naturalnumbers?naturalnumber | nums?nums            | ORDINAL1?modenot.6 | Coq.Init.Datatypes?nat |
| successor function | naturalnumbers?succ          | nums?SUC             | ORDINAL1?func.1    | Coq.Init.Nat?succ      |
| addition           | number_fields?+              | arith?ADD            | ORDINAL2?func.10   | Coq.Init.Nat?add       |
| multiplication     | number_fields?*              | arith?MULT           | ORDINAL2?func.11   | Coq.Init.Nat?mul       |
| less than          | number_fields?               | arith?               | XXREAL_0?pred.1    | Coq.Init.Nat?leb       |

Table 14.1: Alignments for NaturalNumbers (libraries in brackets)
| Concept  | PVS (NASA)                | HOL Light (Standard) | Mizar (Standard)   | Coq (coq-topology)                |
| topology | topology_prelim?topology  | topology?topology    | PRE_TOPC?modenot.1 | TopologicalSpaces?TopologicalSpace |
| open     | topology?open?            | topology?open_in     | PRE_TOPC?attr.3    | TopologicalSpaces?open            |
| closed   | topology?closed?          | topology?closed_in   | PRE_TOPC?attr.4    | TopologicalSpaces?closed          |
| interior | topology?interior         | topology?interior    | TOPS_1?func.1      | InteriorsClosures?interior        |
| closure  | topology?Cl               | topology?closure     | PRE_TOPC?func.2    | InteriorsClosures?closure         |

Table 14.2: Alignments for Topology (libraries in brackets)
| Topic            | HOL Light | PVS    | Mizar  | Coq    |
| Algebra          | 0/0       | 18/1   | 17/0   | 14/0   |
| Calculus         | 15/0      | 14/0   | 16/0   | 5/15   |
| Categories       | 0/0       | 0/0    | 9/1    | 5/0    |
| Combinatorics    | 24/0      | 15/0   | 1/0    | 1/0    |
| Complex Numbers  | 9/2       | 4/6    | 7/2    | 11/2   |
| Graphs           | 5/5       | 17/0   | 20/0   | 7/2    |
| Integers         | 10/0      | 0/0    | 5/2    | 47/3   |
| Lists            | 16/0      | 9/0    | 8/0    | 36/2   |
| Logic            | 7/0       | 7/5    | 7/0    | 24/1   |
| Natural Numbers  | 19/0      | 8/10   | 9/0    | 34/1   |
| Polynomials      | 4/0       | 1/0    | 7/0    | 0/0    |
| Rational Numbers | 0/14      | 2/11   | 0/10   | 14/3   |
| Real Numbers     | 13/2      | 3/10   | 7/4    | 12/2   |
| Relations        | 4/0       | 16/5   | 18/3   | 1/12   |
| Sets             | 23/0      | 28/0   | 18/0   | 19/0   |
| Topology         | 15/0      | 10/0   | 9/0    | 17/1   |
| Vectors          | 13/0      | 7/0    | 15/0   | 0/0    |
| Sum              | 177/23    | 159/48 | 173/22 | 240/42 |

Table 14.3: Number of Bidirectional/Unidirectional Alignments per Library
Chapter 15
Alignment-based Translations Using Interface Theories
Disclaimer:
A significant part of the contents of this chapter (except for Section 15.3) has been previously published as [Mül+17a] with coauthors Florian Rabe, Colin Rothgang and Yufei Liu. Both the writing as well as the theoretical results were developed in close collaboration between the authors, hence it is impossible to precisely assign authorship to individual authors. In particular, Section 15.1 is primarily the result of a student experiment executed by Rothgang and Liu, under joint supervision by Rabe and me. The writing has been reworked for this thesis.
My contribution in this chapter consists of the actual implementation presented in Section 15.2, which I have rewritten from scratch for this thesis. Additionally, I have implemented the alignments, interface theories and translators for the use case in Section 15.3.
Using alignments, we can identify symbols from different libraries with each other. Ideally, in the case of a perfect alignment, we can then reduce translation to a one-to-one symbol substitution. However, given a multitude of libraries in our framework, naively approaching the task of aligning them in the first place still means sifting through two libraries in parallel, implying n² alignment tasks for n libraries.
In recent work, our research group has come to understand this problem more clearly and suggested a systematic solution. Firstly, in [KRSC11] and [KR16b], Michael Kohlhase, Florian Rabe and Claudio Sacerdoti Coen developed the idea of interface theories. Using an analogy to software engineering, we can think of interface theories as specifications and of theorem prover libraries as implementations of formal knowledge.
Secondly, in the OpenDreamKit project [Deh+16; ODK] (see Section 3.3) we pursue the same approach (Math-in-the-Middle, see Section 3.4) in the context of computer algebra systems. In this context, we have already developed some interface theories for basic logical operations such as equality, with approximately 300 alignments to theorem prover libraries.
Not surprisingly, the methodology described in this chapter, while originally developed for fully formal libraries originating from theorem prover systems, has consequently been used to realize the objectives of OpenDreamKit as well.
While alignments have the big advantage that they are cheap to find and implement, they have the disadvantage of not being very expressive; in fact, the definition of alignment itself is (somewhat intentionally) vague. In particular when a translation needs to consider foundational aspects beyond the individual system dialects, alignments alone are insufficient. This is where interface theories come into play: given different implementations of the same mathematical concept, their interface theory contains only those symbols that are a) common to all implementations and b) necessary to use the concept (as opposed to formalizing it in detail) – which in practice turn out to be the same thing.
Libraries of interface theories hence must critically differ from typical theorem prover libraries: they must follow the axiomatic method (as opposed to the method of definitional extensions), be written with minimal foundational commitment, and largely not rely on definitions or proofs in order to stay compatible with development approaches based on alternative definitions.
Using interface theories and the Math-in-the-Middle approach, we can see two symbols being aligned via a sequence of consecutive steps of abstraction:
1.
2.
3.
Consider as a simple example the PVS symbol member and the HOL Light symbol IN. We capture this with
• a symbol for elementhood in a (new) interface theory for sets,
• two alignments of this new symbol with the symbols member and IN, respectively.
This yields a star-shaped network as in the diagram in Figure 15.1, with various formal libraries on the outside, which are connected by alignments via representations of their system dialects (lower-case letters) to the interface theory in the center. Note the very deliberate similarity to Figure 3.4 – the primary difference is that we substituted OpenDreamKit systems with OAF systems (see Section 3.5).
15.1 Interface Theories
As mentioned in Section 14.5, we had two students (Colin Rothgang and Yufei Liu) collect alignments between various theorem prover libraries. In the course of this project, they also developed interface theories for the symbols they aligned.
As a basis for this experiment, they used the foundational theory for the Math-in-the-Middle archive, a simplified excerpt of which is shown in Listing 15.1, as an interface for basic logical constants.
Listing 15.1: An Interface Theory for Logic in Mmt
This theory actually includes most of the features presented in Part II, in order to be as flexible as possible in formalizing content – although most of them were not yet available when the experiment described in this section was originally done.
15.1.1 Example: Natural Numbers
Listing 15.2 shows an example of a simple theory of natural numbers, using the theory Logic and the symbols declared therein. Note that this theory is basically an implementation of the Peano axioms, which can be seen as the interface theory for all possible definitions of a concrete set (or type) of “the” natural numbers.
Listing 15.2: An (excerpt of an) interface theory for natural numbers with Logic as meta-theory
In Mizar [Miz], which is based on Tarski-Grothendieck set theory, the set of natural numbers NAT is defined as omega, the set of finite ordinals. Arithmetic operations are defined directly on those.
In contrast, in PVS (see Chapter 10) natural numbers are defined as a specific subtype of the integers, which in turn are a subtype of the rationals etc., up to an abstract type number which serves as a maximal supertype of all number types. The arithmetic operations are inherited from a subtype number_field of number.
These are two fundamentally different approaches to describe and implement an abstract mathematical concept, but for all practical purposes the concept they describe is the same; namely the natural numbers. The interface theory for both variants would thus only contain the symbols that are relevant to the abstract concept itself, independent of their specific implementation – hence, things like the type of naturals, the arithmetic operations and the Peano axioms. The interface theory thus provides everything we need to work with natural numbers, and at the same time everything we know about them independently of the logical foundation or their specific implementation within any given formal system.
However, there is an additional layer of abstraction here, namely that in stating that the natural numbers in Mizar are the finite ordinals we have already ignored the system dialect (in the sense of item 2 in the introduction to this chapter). This step of abstraction (from the concrete definition using only Mizar-specific symbols) yields another interface theory for finite ordinals, which in turn can be aligned not just with Mizar natural numbers, but also e.g. with MetaMath [MeMa], which is built on ZFC set theory.
Figure 15.2 illustrates this situation. Blue arrows point from more detailed theories to their interfaces. The arrows from PVS or Mizar to interfaces merely strip away the system dialects; the arrows within Interfaces abstract away more fundamental differences in definition or implementation.
Consider again Listing 15.2, a possible interface theory for natural numbers. Note that symbols such as leq could be defined, but do not actually need to be. Since they are only interfaces, all we need is for the symbols to exist.
In fact, the more abstract the interface, the less we want to define the symbols – given that there is usually more than one way to define symbols, definitions are just one more thing we might want to abstract away from completely. There are exceptions to this – note that the symbol zero is actually defined as the literal 0.
This is because literals themselves are mere Mmt terms, not symbols with a stable Mmt URI, which we need for alignments. On the other hand, not having literals excludes those systems where literals are actually used. In this case, providing a definition for the symbol enables aligning both systems with and without literals.
The symbols in this interface theory can then be aligned either with symbols in other formal systems directly, or with additional interfaces in between, such as a theory for Peano arithmetic, or the intersection of all inductive subsets of the real numbers, or finite ordinals or any other possible formalization of the natural numbers.
15.1.2 Additional Interface Theories
The foundation independent nature of Mmt allows us to implement interface theories with almost arbitrary levels of detail and for vastly different foundational settings.
We have started a repository of interface theories specifically for translation purposes [Mitb], and have also aligned to already existing interfaces (as in the case of arithmetics, see below) in a second and third MathHub repository, [Mitc] and [Mita], extending them when necessary.
Crucially, this interface repository contains interface theories for basic type-related symbols like the function type constructors (see Listing 15.3), which are aligned with the respective symbols in HOL Light and PVS. These symbols are so basic as to be primitive in systems based on type theory, and consequently they occur in the vast majority of expressions. Having these symbols aligned is strictly necessary to get any further use of alignments off the ground.
Listing 15.3: Interface Theories for Type-Theoretical Foundations
Here, a structure is used to include the theory for simple function types in the theory for dependent function types, while providing definitions for the symbols in terms of the latter. This automatically yields a translation from the simple to the dependent variant and demonstrates the advantage of using interface theories for translations: The general problem of converting between (systems using) simple and dependent function types is lifted to the level of the interface theories, where we can use theory morphisms (such as structures) to specify the appropriate translation once and for all; independent of any specific external system.
Table 14.3 shows the total number of alignments the students found for PVS, HOL Light and Mizar. The following are some additional examples of mathematical areas covered by the resulting interface theories. The numbers here reflect the result of the student experiment and refer to the number of symbols actually used for alignments therein; the theories themselves have been partially adapted and significantly extended by now:
Limits contains 17 symbols related to sequences and limits, including metric_space and complete (metric spaces and their completeness).
Differentiation contains 4 symbols, namely differentiability in a point and on a set, and the derivative in a point and as a function.
Integration contains 6 symbols, namely integrability and the integral over a set, for Riemann, Lebesgue and gauge integration.
For Arithmetics we use the already existing theories from [Mitc]. They contain interfaces for the number arithmetics below (each split into two interfaces: one for the basic number type definitions and one for the arithmetics on them).
Complex Numbers contains 11 symbols for complex numbers, aligned to their counterparts in HOL Light, PVS and Mizar. Besides the usual arithmetic operations similar to NaturalNumbers, it contains i (the imaginary unit), abs (the modulus of a complex number) and Re, Im (the real and imaginary parts of a complex number).
Integers contains 9 symbols for the usual arithmetic operations on integers and for comparisons between two integers.
Lists contains the 13 most important symbols for lists, including head, tail, concat, length and filter (filter a list using another list) as well as some auxiliary definitions. There are no lists in Mizar; instead, finite sequences are used. These, however, deserve their own interface.
For Logic we use the already existing theories in the [Mita] repository. It contains 9 symbols for boolean algebra that are all perfectly aligned to HOL Light, PVS and Mizar, and sometimes also to Coq.
Sets are also already present in [Mitc], split into many subtheories; 28 of the contained symbols have been aligned. sets contains symbols for typed sets as in a type-theoretical setting, including axioms and theorems. Here we have the most alignments so far. It also contains the following two interfaces:
Functions contains 7 symbols for alignments to functions and their relations that are not already contained in relations.
Topology contains 25 symbols for both general topological spaces as well as the standard topology on ℝⁿ specifically.
In order to translate an expression from one library to another, the concepts in the expression must at least exist in both libraries. This creates the need to inspect the intersection of the concepts in these libraries. Table 15.1 gives an overview of the library intersection for various interface theories.
| Topic            | 1 System | 2 Systems | 3 Systems | 4 Systems |
| Algebra          | 17       | 9         | 5         | 0         |
| Calculus         | 35       | 7         | 8         | 0         |
| Categories       | 4        | 5         | 0         | 0         |
| Combinatorics    | 25       | 6         | 0         | 1         |
| Complex Numbers  | 10       | 5         | 3         | 3         |
| Graphs           | 72       | 6         | 3         | 0         |
| Integers         | 52       | 2         | 7         | 0         |
| Lists            | 28       | 8         | 9         | 0         |
| Logic            | 18       | 0         | 2         | 5         |
| Natural Numbers  | 53       | 2         | 10        | 2         |
| Polynomials      | 12       | 0         | 0         | 0         |
| Rational Numbers | 11       | 4         | 2         | 7         |
| Real Numbers     | 9        | 3         | 5         | 5         |
| Relations        | 21       | 15        | 4         | 0         |
| Sets             | 56       | 10        | 9         | 10        |
| Topology         | 62       | 2         | 8         | 0         |
| Vectors          | 25       | 5         | 0         | 0         |
| Sum              | 510      | 91        | 79        | 33        |

Table 15.1: Number of Concepts Found in Exactly One, Two, Three or Four Systems
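Counts like those in Table 15.1 are easy to derive once alignments into shared interface theories exist; the following sketch uses an illustrative data layout of mine (one (concept, system) pair per alignment):

```python
from collections import defaultdict

def intersection_counts(alignments):
    """alignments: iterable of (concept, system) pairs.
    Returns a dict mapping k to the number of concepts
    that occur in exactly k systems."""
    systems_per_concept = defaultdict(set)
    for concept, system in alignments:
        systems_per_concept[concept].add(system)
    counts = defaultdict(int)
    for systems in systems_per_concept.values():
        counts[len(systems)] += 1
    return dict(counts)
```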
Figure 15.3 shows a small part of the theory graph of the MitM libraries. The full MitM theory graph can be explored on MathHub.
15.2 Implementation
I have implemented a prototypical expression translator in the Mmt system that uses alignments, theory morphisms and programmatic tactics to translate expressions in the form of Mmt terms. The translation mechanism is already used in the OpenDreamKit project and is exposed via the Mmt query language [Rab12].
Crucially, the algorithm returns partial translations when no full translation to a target library can be found. These can be used for finding new alignments automatically by the techniques described in Chapter 16.
We can obtain translators in several ways.
The algorithm Translate(t, Fin, 𝒯) : (Term, Boolean) then reduces to a simple tree search, iteratively applying applicable translators in 𝒯 to subterms of t until Fin is satisfied, or returning a partial translation if no full translation of t can be found.
This implementation is naive and potentially inefficient and should be considered prototypical, but suitably demonstrates the capabilities of the general approach.
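A much-simplified sketch of such a tree search in Python, with terms as nested tuples and translators as partial functions. The names and the breadth-first strategy are illustrative choices of mine, not Mmt's actual implementation:

```python
from collections import deque

def translate(term, is_final, translators, max_steps=1000):
    """Search for a term satisfying is_final by applying translators
    (functions term -> term or None) to subterms of term.
    Returns (result, True) on success, otherwise the last candidate
    and False, standing in for a partial translation."""
    seen, queue = {term}, deque([term])
    best = term
    for _ in range(max_steps):
        if not queue:
            break
        t = queue.popleft()
        if is_final(t):
            return t, True
        best = t
        for t2 in successors(t, translators):
            if t2 not in seen:
                seen.add(t2)
                queue.append(t2)
    return best, False  # no full translation found

def successors(t, translators):
    # apply each translator at the current position ...
    for tr in translators:
        r = tr(t)
        if r is not None:
            yield r
    # ... and recurse into subterms (applications modeled as tuples)
    if isinstance(t, tuple):
        for i, sub in enumerate(t):
            for sub2 in successors(sub, translators):
                yield t[:i] + (sub2,) + t[i + 1:]
```

A single symbol-renaming translator suffices to exercise the search: replacing every "a" by "b" in a nested term takes a few breadth-first steps.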
Before we cover one example in detail, Table 15.2 lists three rather simple example translations between PVS and HOL Light. We use the Pi-types of LF to bind variables on the outside of the intended expressions.
Note that even something like the application of a function entails a translation from function application in HOL Light to function application in PVS, since they belong to their respective system dialects. Furthermore, the translation from simple function types in HOL Light to the dependent Π-type in PVS is actually done on the level of interface theories, where simple function types are defined as a special case of dependent function types (see Section 15.1.2). Unlike the latter two, the first example uses only primitive symbols in both systems that are not part of any external library. Meta-variables have been left out in the third example for brevity.
Furthermore, an argument alignment was used in the third example, which switches the two (non-implicit) arguments – namely the predicate and the list. Whereas this usually trips up automated translation processes or needs to be manually implemented, in our case it is as easy as adding the key-value pair arguments="(2,3)(3,2)" to the alignment.
| HOL Light                                 | PVS                                    |
| {A:holtype, P:term A ...}                 | {A:tp, P:expr ... _:A ... boolean ...} |
| {T:holtype, a:term T, A:term T ...}       | {T:tp, a:expr T, A:expr ... _:T ...}   |
| FILTER (Abs x:bool. x) c :: b :: a :: NIL | filter (c :: b :: a :: null) (...)     |

Table 15.2: Three Expressions Translated
15.3 Example
To exemplify the usefulness of such translations across libraries, we will examine one example with computer algebra systems – which the OpenDreamKit research community has come to call Jane’s Use Case [Koh+17a] – in detail. This example is interesting in that it was (chiefly) developed by John Cremona as a real-world situation a computational group theorist might find themselves in, and in which they would want to use a combination of computer algebra systems.
A user, Jane, wants to experiment with the invariant theory of finite groups. She does so by considering some group G as acting on the variables X₁, ..., Xₙ of the polynomial ring Z[X₁, ..., Xₙ] and studying the ideals generated by the orbit of some polynomial, since these happen to be fixed by the group action. Naturally, in a computational setting one would want to have the Gröbner bases of these ideals as well. GAP happens to be efficient in computing orbits, whereas Singular is known to be very efficient in computing Gröbner bases. For the purposes of this section (and in light of Chapter 12) we will substitute Singular by Sage.
Starting from the Math-in-the-Middle ontology (see Section 3.4), Jane wants to study the dihedral group D₄. So she declares a polynomial p – for example 3X₁ + 2X₂ – over R = Z[X₁, X₂, X₃, X₄], and wants to compute the orbit O(Z, D₄, p) in GAP, translate the resulting ideal back to MitM as I, declare its Gröbner basis G = Groebner(Z, I), translate this to Sage in order to compute it, and have the result presented to her back in MitM.
There are two things that make this example particularly non-trivial:
1.
2.
The polynomial 3X₁ + 2X₂ is formally constructed in GAP using the (internal) syntax
Poly(RationalFunctions(FamilyObject(1)), [[1,1], 3, [2,1], 2]),
where [...] represents lists, and [[k₁, n₁, ..., kₘ, nₘ], a] represents the monomial a·X_{k₁}^{n₁} ⋅ ... ⋅ X_{kₘ}^{nₘ}. The precise meaning of the FamilyObject and RationalFunctions symbols is technical and largely irrelevant.
GAP
In Sage, the same polynomial is instead constructed as
Poly(PolyRing(Z, [X1, X2, X3, X4]), [[[1,0,0,0], 3], [[0,1,0,0], 2]]),
where [[i₁, ..., iₙ], a] represents the monomial a·X₁^{i₁} ⋅ ... ⋅ Xₙ^{iₙ}. Consequently, monomials refer to the specific named variables of the polynomial ring Z[X₁, X₂, X₃, X₄] provided as an argument.
Sage
In general and not surprisingly, there seems to be no universally agreed upon method to formally represent polynomials.
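The two encodings above nevertheless become interconvertible once read into a common sparse form; the following sketch uses function names and a dictionary representation of my own choosing, not GAP's or Sage's actual APIs:

```python
def gap_to_sparse(pairs):
    """GAP-style input: [[k1, n1, ..., km, nm], a, ...], a flat list
    alternating exponent-lists and coefficients. The sparse form maps a
    frozenset of (variable_index, exponent) pairs to a coefficient."""
    poly = {}
    for i in range(0, len(pairs), 2):
        exps, coeff = pairs[i], pairs[i + 1]
        mono = frozenset((exps[j], exps[j + 1]) for j in range(0, len(exps), 2))
        poly[mono] = poly.get(mono, 0) + coeff
    return poly

def sage_to_sparse(monomials):
    """Sage-style input: [[[i1, ..., in], a], ...] with dense exponent
    vectors over the ring's variables (1-indexed in the sparse form)."""
    poly = {}
    for exps, coeff in monomials:
        mono = frozenset((k + 1, e) for k, e in enumerate(exps) if e != 0)
        poly[mono] = poly.get(mono, 0) + coeff
    return poly
```

On the running example 3X₁ + 2X₂, both readers produce the same sparse polynomial, which is exactly what a system-neutral MitM representation needs to record.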
The symbols for lists, integers, Gröbner bases, ideals (which in a computational setting can be represented as a list of generators) and orbits can be easily expressed as typed constants in an interface theory and connected via alignments, so we can focus on those aspects of the translation that are non-trivial.
The discrepancy regarding dihedral groups is reflective of different notational conventions used in the mathematical community, hence it makes sense to have both of them represented in an interface. Correspondingly, for the interface theory in the Math-in-the-Middle library we use both and simply use definitions to convert between them:
Listing 15.4: An Interface Theory for Dihedral Groups
This allows us to use simple alignments to translate from GAP/Sage to MitM, and definition expansion to translate from the second to the first variant. For the other direction, we need a theory morphism, e.g. a view:
Listing 15.5: A View for Translating Dihedral Groups
For converting polynomials, we have to resort to a programmatic translator instead. For a MitM representation of polynomials, we use a more generic method that carries the names of the variables:
Listing 15.6: An Interface Theory for Integer Polynomials
The polynomial 3𝑋_1 + 2𝑋_2 would then be represented as int_polynomial([int_monomial(⟨[⟨X1, 1⟩], 3⟩), int_monomial(⟨[⟨X2, 1⟩], 2⟩)]).
The only thing missing then are programmatic translators from GAP and Sage to MitM and their inverses. As an example, Listing 15.7 shows the translator from MitM polynomials to GAP polynomials, using helper objects (analogous to those described in Section 5.2.1).
Listing 15.7: A Translator from MitM Polynomials to GAP
Given these translators, Jane’s Use Case can be realized as a sequence of applications of the translation algorithm described in Section 15.2.
Figure 15.4 shows the realization of this use case in a Jupyter [Jup] notebook running an Mmt kernel. Here, the expression Evaluate(S, t) is used to instruct the Mmt system to translate the expression 𝑡 to the ontology of system 𝑆 and have the target system evaluate the expression.
Chapter 16
Viewfinding
Disclaimer:
A significant part of the contents of this chapter has been previously published as [MKR18] with coauthors Michael Kohlhase and Florian Rabe. The writing was developed in close collaboration between the authors, hence it is impossible to precisely assign authorship to individual authors.
My contribution in this chapter consists of developing the algorithm presented in this chapter, its implementation and running and evaluating the case study in Section 16.3.
Theory morphisms underlie the notion of theory graphs. They are one of the primary structuring tools employed in Mmt and enable the kind of modularity we aimed for in this thesis, most notably in Part II. As they induce new theorems in the target theory for any theorem of the source theory, views (as opposed to includes and other trivial morphisms, see Section 4.1.1) in particular are high-value elements of a modular formal library. Usually, these are manually encoded, but this practice requires authors who are familiar with source and target theories at the same time, which limits the scalability of the manual approach.
To remedy this problem, I have developed a view finder algorithm that automates theory morphism discovery, and implemented it in Mmt. While this algorithm conceptually makes use of the definitional property of theory morphisms that they are to preserve typing judgements, it ultimately operates on the level of individual declarations after preprocessing which may include translations as in Chapter 15 using alignments – as such, it can also be thought of as an alignment finder across libraries, simultaneously utilizing and extending the set of possible translations between them.
Here, we focus on one specific application of theory classification, where a user can check whether a (part of a) formal theory already exists in some library, potentially avoiding duplication of work or suggesting an opportunity for refactoring.
The basic use case is the following: Mary, a mathematician, becomes interested in a class of mathematical objects, say – as a didactic example – something she initially calls “beautiful subsets” of a base set ℬ (or just “beautiful over ℬ”). These have the following properties 𝑄:
1. the empty set is beautiful over ℬ,
2. every subset of a beautiful set is beautiful over ℬ,
3. if 𝐴 and 𝐵 are beautiful over ℬ and 𝐴 has more elements than 𝐵, then there is an 𝑥 ∈ 𝐴 ∖ 𝐵 such that 𝐵 ∪ {𝑥} is beautiful over ℬ.
To see what is known about beautiful subsets, she types these three conditions into a theory classifier, which computes any theories in a library ℒ that match these (after a suitable renaming). In our case, Mary learns that her “beautiful sets” correspond to the well-known structure of matroids [MWP], so she can directly apply matroid theory to her problems.
In extended use cases, a theory classifier may find theories that share significant structure with 𝑄, so that Mary can formalize 𝑄 modularly with minimal effort. Say Mary was interested in “dazzling subsets”, i.e. beautiful subsets that obey a fourth condition; then she could just contribute a theory that extends the theory matroid by a formalization of the fourth condition – and maybe rethink the name.
Existing systems have so far only worked with explicitly given views, e.g., in IMPS [FGT93] or Isabelle [Isa]. Automatically and systematically searching for new views was first undertaken in [NK07] in 2006. However, at that time no large corpora of formalized mathematics were available in standardized formats that would have allowed easily testing the ideas in practice.
This situation has changed since then as multiple such exports have become available. In particular, we now have OMDoc/Mmt as a uniform representation language for such corpora. Building on these developments, we are now able, for the first time, to apply generic methods — i.e., methods that work at the Mmt level — to search for views in formal libraries.
While inspired by the ideas of [NK07], the design and implementation presented here are completely novel. In particular, the theory makes use of the rigorous language-independent definitions of theory and view provided by Mmt.
[GK14a] applies techniques related to ours to a related problem. Instead of views inside a single corpus, they use machine learning to find similar constants in two different corpora. Their results can roughly be seen as a single partial view from one corpus to the other.
16.1 The Viewfinder Algorithm
Let 𝐶 be a corpus of theories with (for now) the same fixed meta-theory 𝑀. We call the problem of finding views between theories of 𝐶 the view finding problem and an algorithm that solves it a view finder. Note that a view finder is sufficient to solve the theory classification use case above: Mary provides an 𝑀-theory 𝑄 of beautiful sets, and the view finder computes all (total) views from 𝑄 into 𝐶.
Efficiency Considerations. The cost of this problem quickly explodes. First of all, it is advisable to restrict attention to simple views (see Section 4.1.1). Eventually we want to search for arbitrary views as well. But that problem is massively harder because it subsumes theorem proving: a view from Σ to Σ′ maps Σ-axioms to Σ′-proofs, i.e., searching for a view requires searching for proofs.
Secondly, if 𝐶 has 𝑛 theories, we have 𝑛² pairs of theories between which to search. (It is exactly 𝑛² because the direction matters, and even views from a theory to itself are interesting.) Moreover, for two theories with 𝑚 and 𝑛 constants, there are 𝑛^𝑚 possible simple views. (It is exactly 𝑛^𝑚 because views may map different constants to the same one.) Thus, we can in principle enumerate and check all possible simple views in 𝐶. But for large 𝐶, it quickly becomes important to do so in an efficient way that eliminates ill-typed or uninteresting views early on.
Thirdly, it is desirable to search for partial views as well. In fact, identifying refactoring potential in libraries is only possible if we find partial views: then we can refactor the involved theories in a way that yields a total view (see Chapter 17). Moreover, many proof assistant libraries do not follow the little theories paradigm or do not employ any theory-like structuring mechanism at all. These can only be represented as a single huge theory, in which case we have to search for partial views from this theory to itself – facilitating the aforementioned refactoring technique. While partial views can be reduced to and then checked like total ones, searching for partial views makes the number of possible views that must be checked much larger.
Finally, even for a simple view, checking reduces to a set of equality constraints, namely the constraints ⊢_{Σ′} 𝜎(𝐸) = 𝐸′ for the type-preservation condition (see Section 4.1.1). Depending on 𝑀, this equality judgment may be undecidable and require theorem proving.
A central motivation for my algorithm is that equality in 𝑀 can be soundly approximated very efficiently by using a normalization function on 𝑀-expressions. This has the additional benefit that relatively little meta-theory-specific knowledge is needed, and all such knowledge is encapsulated in a single well-understood function. This way we can implement view search generically for arbitrary 𝑀.
The algorithm consists of two steps. First, we preprocess all constant declarations in 𝐶 with the goal of moving as much intelligence as possible into a step whose cost is linear in the size of 𝐶. Then, we perform the view search on the optimized data structures produced by the first step.
16.1.1 Preprocessing
The preprocessing phase computes for every constant declaration 𝑐 : 𝐸 a normal form 𝐸′ and then efficiently stores the abstract syntax tree (defined below) of 𝐸′. Both steps are described below.
Normalization. This involves two steps: Mmt-level normalization performs generic transformations that do not depend on the meta-theory 𝑀. These include flattening and definition expansion. Importantly, we do not fully eliminate defined constant declarations 𝑐 : 𝐸 = 𝑒 from a theory Σ: instead, we replace them with primitive constants 𝑐 : 𝐸 and replace every occurrence of 𝑐 in other declarations with 𝑒. If Σ is the domain theory, we can simply ignore 𝑐 : 𝐸 (because views do not have to provide an assignment to defined constants). But if Σ is the codomain theory, retaining 𝑐 : 𝐸 increases the number of views we can find; in particular in situations where 𝐸 is a type of proofs, and hence 𝑐 a theorem.
Meta-theory-level normalization applies an 𝑀-specific normalization function. In general, we assume this normalization to be given as a black box. However, because many practically important normalization steps are widely reusable, we provide a few building blocks, from which specific normalization functions can be composed. Skipping the details, these include:
1. Top-level universal quantifiers and implications are rewritten into the function space of the logical framework using the Curry-Howard correspondence.
2. The order of curried domains of function types is normalized as follows: first all dependent argument types, ordered by the first occurrence of the bound variables; then all non-dependent argument types, ordered by the total order on abstract syntax trees mentioned below.
3. Implicit arguments, whose value is determined by the values of the others, are dropped, e.g. the type argument of an equality. This has the additional benefit of shrinking the abstract syntax trees and speeding up the search.
4. Equalities are normalized such that the left hand side has a smaller abstract syntax tree.
Multiple of the above normalization steps make use of a total order on abstract syntax trees. We omit the details and only remark that we try to avoid using the names of constants in the definition of the order; otherwise, declarations that could be matched by a view would be normalized differently. Even when breaking ties requires comparing two constants, we can first try to recursively compare the syntax trees of their types.
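As a concrete illustration of the last step, the following sketch orients an equality so that the side with the smaller syntax tree comes first; the tuple encoding of trees is an assumption made for illustration and not the actual Mmt data structure:

```python
def size(t):
    """Number of nodes of a tree; leaves are strings, inner nodes are
    tuples (head, child1, ..., childn)."""
    return 1 + sum(size(c) for c in t[1:]) if isinstance(t, tuple) else 1

def orient_equality(lhs, rhs):
    """Reorder the two sides of an equality so that the smaller abstract
    syntax tree ends up on the left; only sizes are compared here,
    never constant names."""
    return (lhs, rhs) if size(lhs) <= size(rhs) else (rhs, lhs)

# plus(x, 0) = x is normalized to x = plus(x, 0):
print(orient_equality(("plus", "x", "0"), "x"))
# ('x', ('plus', 'x', '0'))
```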
Abstract Syntax Trees. We define abstract syntax trees as pairs (𝑡, 𝑠) where 𝑡 is subject to the grammar 𝑡 ::= 𝐶_{𝑁𝑎𝑡} | 𝑉_{𝑁𝑎𝑡} | 𝑡[𝑡⁺](𝑡⁺) (where 𝑁𝑎𝑡 is a non-terminal for natural numbers) and 𝑠 is a list of constant names.
We obtain an abstract syntax tree from an Mmt expression 𝐸 by (i) switching to a de-Bruijn representation of bound variables and (ii) replacing all occurrences of constants with 𝐶_𝑖 in such a way that every 𝐶_𝑖 refers to the 𝑖-th element of 𝑠.
Abstract syntax trees have the nice property that they commute with the application of simple views 𝜎: if (𝑡, 𝑠) represents 𝐸, then 𝜎(𝐸) is represented by (𝑡, 𝑠′), where 𝑠′ arises from 𝑠 by replacing every constant with its 𝜎-assignment.
The above does not completely specify 𝑡 and 𝑠 yet, and there are several possible canonical choices among the abstract syntax trees representing the same expression. The trade-off is subtle because we want to make it easy to both identify and check views later on. We call (𝑡, 𝑠) the long abstract syntax tree for 𝐸 if 𝐶_𝑖 replaces the 𝑖-th occurrence of a constant in 𝐸 when 𝐸 is read in left-to-right order. In particular, the long tree does not merge duplicate occurrences of the same constant into the same number. The short abstract syntax tree for 𝐸 arises from the long one by removing all duplicates from 𝑠 and replacing the 𝐶_𝑖 accordingly.
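The construction of long abstract syntax trees, and the fact that applying a simple view only touches 𝑠 while leaving 𝑡 unchanged, can be sketched as follows. The encoding (strings for constant occurrences, integers for de-Bruijn variables, tuples for applications, binders omitted) is a simplification for illustration, not the Mmt implementation:

```python
def long_ast(expr):
    """Compute the long abstract syntax tree (t, s) of an expression:
    the i-th constant occurrence (in left-to-right order) becomes
    ("C", i), and its name is recorded as the i-th entry of s."""
    s = []
    def go(e):
        if isinstance(e, str):      # constant occurrence
            s.append(e)
            return ("C", len(s))
        if isinstance(e, int):      # de-Bruijn variable
            return ("V", e)
        return tuple(go(x) for x in e)  # application node
    return go(expr), s

def apply_view(tree, sigma):
    """Applying a simple view replaces constants in s; t is untouched."""
    t, s = tree
    return t, [sigma.get(c, c) for c in s]

# beautiful(x) ∧ y ⊆ x ⇒ beautiful(y), with x = V2 and y = V1:
ax = ("impl", ("and", ("beautiful", 2), ("subset", 1, 2)), ("beautiful", 1))
t, s = long_ast(ax)
print(s)  # ['impl', 'and', 'beautiful', 'subset', 'beautiful']
print(apply_view((t, s), {"beautiful": "finite"})[1])
# ['impl', 'and', 'finite', 'subset', 'finite']
```

Note that the duplicate occurrence of beautiful stays duplicated in 𝑠, as required for the long tree.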
Example 16.1:
Consider the axiom ∀𝑥 : set, 𝑦 : set. beautiful(𝑥) ∧ 𝑦 ⊆ 𝑥 ⇒ beautiful(𝑦) with internal representation ∀[x : set, y : set](⇒(∧(beautiful(x), ⊆(y, x)), beautiful(y))).
The short syntax tree and list of constants associated with this term would be:
𝑡 = 𝐶_1[𝐶_2, 𝐶_2](𝐶_3(𝐶_4(𝐶_5(𝑉_2), 𝐶_6(𝑉_1, 𝑉_2)), 𝐶_5(𝑉_1)))
𝑠 = (∀, set, ⇒, ∧, beautiful, ⊆)
The corresponding long syntax tree is:
𝑡 = 𝐶_1[𝐶_2, 𝐶_3](𝐶_4(𝐶_5(𝐶_6(𝑉_2), 𝐶_7(𝑉_1, 𝑉_2)), 𝐶_8(𝑉_1)))
𝑠 = (∀, set, set, ⇒, ∧, beautiful, ⊆, beautiful)
For our algorithm, we pick the long abstract syntax tree, which may appear surprising. The reason is that shortness is not preserved when applying a simple view: whenever a view maps two different constants to the same constant, the resulting tree is not short anymore. Length, on the other hand, is preserved. The disadvantage that long trees take more time to traverse is outweighed by the advantage that we never have to renormalize the trees.
16.1.2 Search
Consider two constants 𝑐 : 𝐸 and 𝑐′ : 𝐸′, where 𝐸 and 𝐸′ are preprocessed into long abstract syntax trees (𝑡, 𝑠) and (𝑡′, 𝑠′). It is now straightforward to show the following lemma:
Lemma 16.1:
The assignment 𝑐 ↦ 𝑐′ is well-typed in a view 𝜎 if 𝑡 = 𝑡′ (in which case 𝑠 and 𝑠′ must have the same length 𝑙) and 𝜎 also contains 𝑠_𝑖 ↦ 𝑠_𝑖′ for 𝑖 = 1, ..., 𝑙.
Proof. The claim is simply the typing-preservation condition of theory morphisms for the special case of simple views, expressed using abstract syntax trees.
Of course, the condition about 𝑠_𝑖 ↦ 𝑠_𝑖′ may be redundant if 𝑠 contains duplicates; but because 𝑠 has to be traversed anyway, it is cheap to skip all duplicates. We call the set of assignments 𝑠_𝑖 ↦ 𝑠_𝑖′ the prerequisites of 𝑐 ↦ 𝑐′.
Lemma 16.2: Core Algorithm
Consider two constant declarations 𝑐 and 𝑐′ with 𝑡 = 𝑡′ in theories Σ and Σ′. We define a view by starting with 𝜎 = {𝑐 ↦ 𝑐′} and recursively adding all prerequisites to 𝜎 until
• either the recursion terminates,
• or 𝜎 would have to contain two different assignments for the same constant, in which case the algorithm fails,
• or the recursion requires an assignment between two constants whose syntax trees differ, in which case it fails as well.
If the above algorithm succeeds, then 𝜎 is a well-typed partial simple view from Σ to Σ′.
Proof. Note that the algorithm constructs a partial view by recursively ensuring that the premises of Lemma 16.1 hold.
Example 16.2:
Consider two constants 𝑐 and 𝑐′ with types ∀𝑥 : set, 𝑦 : set. beautiful(𝑥) ∧ 𝑦 ⊆ 𝑥 ⇒ beautiful(𝑦) and ∀𝑥 : powerset, 𝑦 : powerset. finite(𝑥) ∧ 𝑦 ⊆ 𝑥 ⇒ finite(𝑦). Their syntax trees are
𝑡 = 𝑡′ = 𝐶_1[𝐶_2, 𝐶_3](𝐶_4(𝐶_5(𝐶_6(𝑉_2), 𝐶_7(𝑉_1, 𝑉_2)), 𝐶_8(𝑉_1)))
𝑠 = (∀, set, set, ⇒, ∧, beautiful, ⊆, beautiful)
𝑠′ = (∀, powerset, powerset, ⇒, ∧, finite, ⊆, finite)
Since 𝑡 = 𝑡′, we set 𝑐 ↦ 𝑐′ and compare 𝑠 with 𝑠′, meaning we check (ignoring duplicates) that ∀ ↦ ∀, set ↦ powerset, ⇒ ↦ ⇒, ∧ ↦ ∧, beautiful ↦ finite and ⊆ ↦ ⊆ are all valid.
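Instantiated with the data of this example, the matching step of Lemmas 16.1 and 16.2 can be sketched as follows; the encoding is hypothetical, the shared tree is abbreviated to an opaque value, and the recursive well-typedness check of the prerequisite assignments themselves is omitted:

```python
def match(c, c1, trees, trees1):
    """Try the assignment c ↦ c1: if the long trees agree, collect the
    prerequisites s_i ↦ s'_i, failing on conflicting assignments."""
    t, s = trees[c]
    t1, s1 = trees1[c1]
    if t != t1:
        return None                      # different trees: no view
    sigma = {c: c1}
    for a, b in zip(s, s1):
        if sigma.setdefault(a, b) != b:  # conflicting prerequisite
            return None
    return sigma

# Long trees are abbreviated to the placeholder "T" (they are equal):
trees = {"ax": ("T", ["forall", "set", "set", "impl", "and",
                      "beautiful", "subset", "beautiful"])}
trees1 = {"ax'": ("T", ["forall", "powerset", "powerset", "impl", "and",
                        "finite", "subset", "finite"])}
sigma = match("ax", "ax'", trees, trees1)
print(sigma["beautiful"], sigma["set"])  # finite powerset
```

The duplicate entries of 𝑠 (set, beautiful) are checked for consistency rather than added twice, exactly as in the prose above.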
To find all views from Σ to Σ′, we first run the core algorithm on every pair of Σ-constants and Σ′-constants. This usually does not yield big views yet. For example, consider the typical case where theories contain some symbol declarations and some axioms in which the symbols occur. Then the core algorithm will only find views that map at most one axiom.
Depending on what we intend to do with the results, we might prefer to consider them individually (e.g. to yield alignments). But we can also use these small views as building blocks to construct larger, possibly total ones:
Lemma 16.3: Amalgamating Views
We call two partial views compatible if they agree on all constants for which both provide an assignment.
The union of compatible well-typed views is again well-typed.
Proof. Note that it is sufficient to show that for any constant 𝑐 : 𝑇 in its domain, the amalgamated view 𝜎 satisfies 𝜎(𝑐) ⇐ 𝜎(𝑇). Since the partial views amalgamated to 𝜎 are assumed to be compatible and well-typed, this immediately follows from the assignment 𝑐 ↦ 𝜎(𝑐) being in (at least) one of the constituent partial views.
Example 16.3:
Consider the partial view from Example 16.2 and imagine a second partial view for the axioms beautiful(∅) and finite(∅). The former has the requirements ∀ ↦ ∀, set ↦ powerset, ⇒ ↦ ⇒, ∧ ↦ ∧, beautiful ↦ finite, ⊆ ↦ ⊆. The latter requires only set ↦ powerset and ∅ ↦ ∅. Since both views agree on all assignments, we can merge all of them into a single view, mapping both axioms and all requirements of both.
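Amalgamation in the sense of Lemma 16.3 then amounts to a compatibility check followed by a union of assignments; plain dictionaries stand in for the actual view data structures:

```python
def amalgamate(v1, v2):
    """Union of two compatible partial views, or None if they disagree
    on some constant for which both provide an assignment."""
    if any(k in v2 and v2[k] != v for k, v in v1.items()):
        return None  # incompatible: conflicting assignments
    return {**v1, **v2}

# The two partial views from Example 16.3 agree on "set", so they merge:
axiom1 = {"forall": "forall", "set": "powerset", "impl": "impl",
          "and": "and", "beautiful": "finite", "subset": "subset"}
axiom2 = {"set": "powerset", "empty": "empty"}
merged = amalgamate(axiom1, axiom2)
print(sorted(merged))
# ['and', 'beautiful', 'empty', 'forall', 'impl', 'set', 'subset']
```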
16.1.3 Optimizations
The above presentation is intentionally simple to convey the general idea. We now consider a few advanced features of the implementation to enhance scalability.
Caching Preprocessing Results. Because the preprocessing performs normalization, it can be time-consuming. Therefore, we allow for storing the preprocessing results to disk and reloading them in a later run.
Fixing the Meta-Theory. We improve the preprocessing in a way that exploits the common meta-theory, which is meant to be fixed by every view. All we have to do is, when building the abstract syntax trees (𝑡, 𝑠), to retain all references to constants of the meta-theory in 𝑡 instead of replacing them with numbers. With this change, 𝑠 will never contain meta-theory constants, and the core algorithm will only find views that fix all meta-theory constants. Because 𝑠 is much shorter now, the view search is much faster.
It is worth pointing out that the meta-theory is not always as fixed as one might think. Often we want to consider certain constants that are defined early on in the library and then used widely to be part of the meta-theory. In PVS (see Chapter 10), this makes sense, e.g., for all operations defined in the Prelude library. For across-library view finding, we might also apply a partial translation (in the sense of Chapter 15) to the target library first and consider all successfully translated symbols as part of the meta-theory, ensuring that all morphisms found are compatible with and predicated on already known alignments and morphisms.
Note that we still only have to cache one set of preprocessing results for each library: changes to the meta-theory only require minor adjustments to the abstract syntax trees without redoing the entire normalization.
Biasing the Core Algorithm. The core algorithm starts with an assignment 𝑐 ↦ 𝑐′ and then recurses into the constants that occur in the declarations of 𝑐 and 𝑐′. This occurs-in relation typically splits the constants into layers. A typical theory declares types, which then occur in the declarations of function symbols, which then occur in axioms. Because views that only map type and function symbols are rarely interesting (because they do not allow transporting non-trivial theorems), we always start with assignments where 𝑐 is an axiom, but other conditions for starting declarations are possible.
Exploiting Theory Structure. Libraries are usually highly structured using imports between theories. If Σ is imported into Σ′, then the set of partial views out of Σ′ is a superset of the set of partial views out of Σ. If implemented naively, that would yield a quadratic blow-up in the number of views to consider.
Instead, when running our algorithm on an entire library, we only consider views between theories that are not imported into other theories. In an additional postprocessing phase, the domain and codomain of each found partial view 𝜎 are adjusted to the minimal theories that make 𝜎 well-typed.
16.2 Implementation
I have implemented our view finder algorithm in the Mmt system and exposed it in Mmt’s jEdit IDE. A screenshot of Mary’s theory of beautiful sets is given in Figure 16.1. Right-clicking anywhere within the theory allows Mary to select MMT → Find Views to... → MitM/smglom. The latter menu offers a choice of known libraries in which the view finder should look for codomain theories; MitM/smglom is the Math-in-the-Middle library (see Section 3.4).
After choosing MitM/smglom, the view finder finds two views (within less than one second) and shows them (Figure 16.2).
The first of these (View1) has a theory for matroids as its codomain, which is given in Listing 16.1. Inspecting that theory and the assignments in the view, we see that it indeed represents the well-known correspondence between beautiful sets and matroids.
Listing 16.1: The Theory of Matroids in the MitM Library
The latter uses predefined propositions in its axioms and uses a type coll for the collection of sets, while the former has the statements of the axioms directly in the theory and uses a predicate beautiful – since (typed) sets are defined as predicates, definition expansion is required for matching. Additionally, the implication that beautiful sets (or sets in a matroid) are finite is stated as a logical formula in the former, while the latter uses the Curry/Howard correspondence.
16.3 Cross-Library Viewfinding
We now generalize view finding to the situation where domain and codomain live in different libraries written in different logics. Intuitively, the key idea is that we now have two fixed meta-theories 𝑀 and 𝑀′ and a fixed meta-view 𝑚 : 𝑀 → 𝑀′. However, due to the various idiosyncrasies of logics, tools’ library structuring features and individual library conventions, this problem is significantly more difficult than intra-library view finding. For example, unless the logics are closely related, meta-views usually do not even exist and must be approximated. Therefore, a lot of tweaking is typically necessary, and it is possible that multiple runs with different trade-offs give different interesting results.
As an example, we present a large case study where we find views from the MitM library used in the running example so far into the PVS/NASA library (see Chapter 10).
Theory Structure Normalization. PVS’s complex and prevalently used parametric theories critically affect view finding because they affect the structure of theories. For example, the theory of groups group_def in the NASA library has three theory parameters (T, ∗, one) for the signature of groups, includes the theory monoid_def with the same parameters, and then declares the axioms for a group in terms of these parameters. Without special treatment, we could only find views from/into libraries that use the same theory structure.
We have investigated three approaches to handling parametric theories:
1. Simple treatment: We drop theory parameters and interpret references to them as free variables that match anything. This is of course not sound, so all found views must be double-checked. However, because practical search problems often do not require exact results, even returning all potential views can be useful.
2. Covariant elimination: We treat theory parameters as if they were constants declared in the body. In the above mentioned theory, the parameters (T, ∗, one) would simply become additional constants of the theory of groups.
3. Contravariant elimination: The theory parameters are treated as if they were bound separately for every constant in the body of the theory. In the above mentioned theory, every declaration would be quantified over the parameters (T, ∗, one) individually.
I have implemented the first two approaches. The first is the most straightforward but it leads to many false positives and false negatives. I have found the second approach to be the most useful for inter-library search since it most closely corresponds to simple formalizations of abstract theories in other libraries. The third approach will be our method of choice when investigating intra-library views of PVS/NASA in future work.
As a first use case, we can write down a theory for a commutative binary operator using the MitM foundation, while targeting the PVS Prelude library – allowing us to find all commutative operators, as in Figure 16.3 (using the simple approach to theory parameters).
This example also hints at a way to iteratively improve the results of the view finder: since we can find properties like commutativity and associativity, we can in turn use these results to inform a better normalization of the theory by exploiting those properties. This would potentially allow for finding yet more views.
To evaluate the approaches to theory parameters, we used a simple theory of monoids in the MitM foundation and the theory of monoids in the NASA library as domains for view finding, with the whole NASA library as target, using the simple and covariant approaches. The results are summarized in Figure 16.4.
Most of the results in the simple MitM → NASA case are artifacts of the theory parameter treatment and view amalgamation – in fact, only two of the 17 results are meaningful (to operations on sets and the theory of number fields). In the covariant case, the additional requirements lead to fuller (one total) and less spurious views. With a theory from the NASA library as domain, the results are already too many to be properly evaluated by hand. With the simple approach to theory parameters, most results can be considered artifacts; in the covariant case, the most promising results yield (partial) views into the theories of semigroups, rings (both the multiplicative and additive parts) and most extensions thereof (due to the duplication of theory parameters as constants).
16.4 Applications for Viewfinding
We have seen how a view finder can be used for theory classification and for finding constants with specific desired properties, but many other potential use cases are imaginable. The main problems to solve for these are less the algorithm or software design challenges than the user interfaces.
The theory classification use case described in Section 16.2 is mostly desirable in a setting where a user is actively writing or editing a theory, so the integration in jEdit is sensible. However, the across-library use case in Section 16.3 would already be a lot more useful in a theory exploration setting, such as when browsing available archives on MathHub [Ian+14] or in the graph viewer integrated in Mmt [RKM17]. Additional specialized user interfaces would enable or improve the following use cases:
• Furthermore, partial views – especially those that are total on some included theory – could be insightful counterexamples.
• Additionally, surjective partial views would inform her that her theory would probably better be refactored as an extension of the codomain, which would allow her to use all theorems and definitions therein.
• This would allow for e.g. discovering and importing theorems and useful definitions from some other library – which on the basis of our encodings can be done directly by the view finder. A useful interface might specifically prioritize views into theories on top of which there are many theorems and definitions that have been discovered.
For some of these use cases it would be advantageous to look for views into our working theory instead.
Note that even though the algorithm is in principle symmetric, some aspects often depend on the direction – e.g. how we preprocess the theories, which constants we use as starting points or how we aggregate and evaluate the resulting (partial) views (see Sections 16.1.3 and 16.3).
Chapter 17
Refactoring and Theory Discovery via Theory Intersections
Disclaimer:
A significant part of the contents of this chapter has been previously published as Work-in-Progress as [MK15] with coauthor Michael Kohlhase. Both the writing as well as the theoretical results were developed in close collaboration between the authors, hence it is impossible to precisely assign authorship to individual authors.
My contribution in this chapter consists of all implementations described herein.
An important driver of mathematical development is the discovery of new mathematical objects, concepts and theories. Even though there are many different situations that give rise to a new theory, it seems that a common instance is the discovery of commonalities between apparently different mathematical structures (if such exist). In fact, many of the algebraic theories in Bourbaki naturally arise as the “common part” of two (or more) different mathematical structures – for instance, one could interpret the theory of groups as the common theory of (ℤ, +, 0) and the set of symmetry operations on e.g. a square.
In [Koh14], Kohlhase proposes a notion of theory intersection – elaborating an earlier formalization from [Nor08] to Mmt [RK13b; HKR12b] – that captures this phenomenon in a formal setting: Let two theories 𝑆 and 𝑇, a partial view 𝜎 : 𝑆 → 𝑇 with domain 𝐷 and codomain 𝐶, and its partial inverse 𝛿 be given (as in Figure 17.1 on the left). Then we can pass to a more modular theory graph, where 𝑆′ := 𝑆/𝐷 and 𝑇′ := 𝑇/𝐶 (as in Figure 17.1 on the right). In this case we think of the isomorphic theories 𝐷 and 𝐶 as the intersection of the theories 𝑆 and 𝑇 along 𝜎 and 𝛿.
17.1 Theory Intersection by Example
To fortify our intuitions, we examine a concrete mathematical example in detail.
We start out with a theory PosPlus of positive natural numbers with addition and intersect it with a theory StrConc of strings with concatenation (as in Figure 17.2). Note that we do not start with modular developments; rather, the modular structure is (going to be) the result of intersecting with various examples. Also note that the views k and l are both partial.
Now the intersection as proposed above yields Figure 17.3, which directly corresponds to the schema in Figure 17.1. Note that the new pair of (equivalent) theories is completely determined by the partial morphism k; the only thing we have to invent are the names – and we have not been very inventive here.
Intuitively, the intersection theory is the theory of semigroups, which is traditionally written down in Mmt as in Figure 17.4. And indeed, sg is a renaming of both A and B.
For this situation, we should have a variant theory intersection operator that takes a partial morphism and returns a single intersection theory, following the schema in Figure 17.5.
Let us call this operation unary TI and the previous one binary TI. To compute it, we need to specify a name 𝑁 for the new theory and two renamings 𝜌_1 : 𝑁 → Dom(𝜎) and 𝜌_2 : 𝑁 → Img(𝜎). Note that in this TI operator, the intersection is connected to the “rest theories” via Mmt structures – rather than mere inclusions – which carry the assignments induced by the partial morphisms suitably composed with the renamings (see next section).
In our example we can obtain the theory sg from Figure 17.4 via the renamings 𝜌_1 := {G ↦ N, ○ ↦ +, assoc ↦ assoc} and 𝜌_2 := {G ↦ 𝐴∗, ○ ↦ ∷, assoc ↦ assoc}.
Unary TI is often the more useful operation on theory graphs, but needs direct user supervision, whereas binary TI is fully automatic if we accept generated names for the intersection theories.
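On drastically simplified data (theories as plain name-to-type maps, types copied verbatim rather than translated along the renamings), unary TI can be sketched as follows; the function and the generated names c0, c1, ... are illustrative, not the actual Mmt API:

```python
def intersect(S, T, sigma):
    """Split out the common part of S and T along the renaming sigma
    (mapping S-constants to T-constants). Returns the intersection
    theory together with the renamings rho1 (into S) and rho2 (into T)."""
    common = [c for c in S if c in sigma]
    inter = {f"c{i}": S[c] for i, c in enumerate(common)}
    rho1 = {f"c{i}": c for i, c in enumerate(common)}         # N -> S
    rho2 = {f"c{i}": sigma[c] for i, c in enumerate(common)}  # N -> T
    return inter, rho1, rho2

PosPlus = {"N": "type", "+": "N -> N -> N", "assoc": "..."}
StrConc = {"A*": "type", "::": "A* -> A* -> A*", "assoc": "...", "nil": "A*"}
k = {"N": "A*", "+": "::", "assoc": "assoc"}  # the partial renaming
inter, rho1, rho2 = intersect(PosPlus, StrConc, k)
print(rho1)  # {'c0': 'N', 'c1': '+', 'c2': 'assoc'}
print(rho2)  # {'c0': 'A*', 'c1': '::', 'c2': 'assoc'}
```

This mirrors the renamings 𝜌_1 and 𝜌_2 given above for sg, except that the intersection constants here receive generated names.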
17.2 Reducing Partial Morphisms to Renamings
In the previous example it is noteworthy that the morphisms k and l are fully inverse to each other. In fact, they are renamings. By a renaming, we mean a partial morphism 𝜎 : 𝑆 → 𝑇 such that for every constant 𝑐 declared in 𝑆, 𝜎(𝑐) is again a constant (as opposed to a complex term).
Naturally, theory intersections are a lot simpler for renamings. In fact, we only need a single renaming 𝜎 : 𝑆 → 𝑇 to intersect along (since the inverse of a renaming is again a renaming and uniquely determined). It turns out that we can always reduce the situation in Figure 17.1 to the much easier case where we have a single renaming 𝜎 : 𝑆′ → 𝑇′, where 𝑆′ and 𝑇′ are conservative extensions of our original theories:
Let 𝛿 : 𝑆 → 𝑇 be a partial theory morphism and 𝛿(𝑐) = 𝑡 for a complex term 𝑡. We can then conservatively extend 𝑇 to a theory 𝑇_1 that contains a new defined constant 𝑐_𝛿 = 𝑡 and adapt 𝛿 to a new morphism 𝛿_1 with 𝛿_1(𝑐) = 𝑐_𝛿 (and 𝛿_1 ↾ (𝑆 ∖ {𝑐}) = 𝛿 ↾ (𝑆 ∖ {𝑐})), as in Figure 17.6. If we repeat this process for every mapping 𝛿(𝑐) = 𝑡 that is not already a simple renaming, we obtain a new conservative extension 𝑇′ such that the corresponding morphism 𝛿′ (accordingly adapted) is a renaming.
Doing the same with a partial inverse ρ ∶ T → S, we ultimately get a single renaming σ ∶ S′ → T′ that is equivalent to the original pair of partial morphisms. We can therefore w.l.o.g. restrict ourselves to this case in every instance.
17.3 Implementation
I have implemented a method for unary TI in an earlier version of the Mmt API.
The intersection method takes as parameters two theories, a name for the intersection, and a list of pairs of declarations, which can be obtained from a pair of morphisms, from a single morphism, or via the view finder presented in Chapter 16. It returns the intersection theory and the refactored versions of (conservative extensions of) the original theories, depending on the morphisms originally provided.
Both operations are integrated into a refactoring panel, which is part of the Mmt plugin for jEdit. To intersect two theories, the user can either provide one or two morphisms between the theories or let the view finder pick one for them. The declarations in the intersection theory can optionally be named.
An annotated video demonstration of the refactoring panel and its components can be found on YouTube [IntV15].
If we accept automatically generated names for the theory intersections, we can eliminate user interaction and use the view finder to automate the intersection operation on a set of theories without having to provide the morphisms to intersect along beforehand. This could be used to refactor whole libraries in a more modular way (in concordance with the little theories approach [FGT92b]). Additionally, by supplying the view finder with alignments and other translators between libraries, our approach could be used to refactor a whole library along the theory graph of an already modular library.
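A driver for such a library-wide refactoring could look roughly like this; find_morphisms and intersect are hypothetical stand-ins for the view finder of Chapter 16 and the TI operation of this chapter:

```python
def refactor_library(theories, find_morphisms, intersect):
    """For every pair of theories for which the (hypothetical) view-finder
    stand-in returns a partial morphism, compute an intersection theory
    under an automatically generated name."""
    intersections = {}
    names = sorted(theories)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            for k, sigma in enumerate(find_morphisms(theories[a], theories[b])):
                # generated name: no user interaction required
                intersections[f"{a}_cap_{b}_{k}"] = intersect(
                    theories[a], theories[b], sigma)
    return intersections
```

The interesting work hides entirely in the two parameters; the point of the sketch is only that, once names are generated, no step requires user supervision.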
We do not yet have a heuristic for automatically evaluating how interesting the resulting theories are; however, given a morphism between two interesting theories, we have observed that the corresponding intersection tends to be interesting as well, as long as the morphism is not meaningless itself.
Figure 17.5: Unary TI
Figure 17.6: Obtaining a Renaming From a Morphism
Part V
Conclusion and Future Work
We have seen how we can take on the integration problem for formal libraries by representing them in a unifying meta-framework (Part III), using a flexible logical framework to specify their foundations (Part II), aligning their contents (Chapter 14), and implementing foundation-independent, alignment-aware algorithms for identifying overlap suggestive of new alignments (Chapter 16), translating expressions across libraries (Chapter 15), and modularizing library content by refactoring along theory intersections (Chapter 17).
The knowledge management techniques described in Part IV are certainly not exhaustive; many more services are imaginable, but the ones presented in this thesis demonstrate the effectiveness of the general approach to library integration described herein. Furthermore, these services facilitate the aims of the OAF (see Section 3.5) and OpenDreamKit (see Section 3.3) projects, as demonstrated in Section 15.3. Additionally, they represent steps towards a system integrating all aspects of the tetrapod (see Chapter 1): inference, computation, tabulation, and organization.
Naturally, this opens up many avenues for future research and improvements, which fall broadly into three categories:
1. Fundamentals: Many implementations described in this thesis are prototypical and can be optimized or otherwise improved. In particular:
1.
2.
3.
4.
To check a judgement t ⇐ T we have two strategies: We can either simplify T until we find an applicable foundation-dependent typing rule specifically for (the head symbol of) T, or we can try to infer the principal type T′ of t and check T′ <∶ T. In most cases, the existence of a typing rule implies that the former strategy will quickly yield a result, whereas inferring the type can be expensive; hence the former strategy is attempted first. However, imagine T being a single reference to a constant with a complex definiens, and t being a function application f(a), where the return type of f is T. In this case, inferring the principal type of t is fast and the subsequent check T <∶ T trivial, whereas simplifying T leads to definition expansion and subsequent further simplifications, which will ultimately need to be repeated for the inferred type of t as well, since a function application can only ever check against a type via the inference strategy. This kind of situation occurs regularly in high-level formalizations, and is correspondingly often encountered in the development of the Math-in-the-Middle library.
To remedy this problem, I am currently working on a reimplementation of the Mmt solver algorithm using multithreading; not just to exploit multicore processors, but to be able to run competing checking strategies on separate threads in parallel, such that if either strategy terminates quickly, the checking algorithm can return a result just as quickly. This would obviate the current need to be cautious when adding potentially slow inference rules.
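The idea of racing competing strategies can be illustrated with standard thread pools; this is a generic sketch of the scheduling idea only, not the actual Mmt solver, and the two example strategies are placeholders:

```python
import time
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait

def race(strategies):
    """Run competing checking strategies in parallel and return the result
    of whichever finishes first. Strategies still pending are cancelled on
    a best-effort basis (a thread that already started runs to completion)."""
    with ThreadPoolExecutor(max_workers=len(strategies)) as pool:
        futures = [pool.submit(s) for s in strategies]
        done, pending = wait(futures, return_when=FIRST_COMPLETED)
        for f in pending:
            f.cancel()
        return next(iter(done)).result()

def simplify_strategy():
    time.sleep(0.2)   # stands in for expensive definition expansion
    return "checked via simplification"

def infer_strategy():
    return "checked via type inference"
```

Here race([simplify_strategy, infer_strategy]) returns the inference result almost immediately, even though the simplification strategy is still running; neither strategy has to be conservatively disabled in advance.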
2. Scaling Up: Many of the implementations and approaches presented in this thesis scale in usefulness and coverage with the amount of available data, features and implementations. All of these can and should be scaled up in the future, for example:
1.
2.
3.
4.
5.
6.
3. Workflows: One of the major obstacles to using all of the techniques in this thesis is a certain lack of tool support for the specific workflows of various use cases. In particular, the algorithms presented in Part IV are primarily usable by programmatically calling methods in the Mmt API, or via commands in the Mmt shell. For example, in Chapter 16 my coauthors and I already conjectured (in the original publication) possible applications for the view finder algorithm, most of which are predicated on the existence of a bespoke user interface for that particular use case. Designing such interfaces entails thinking deeply about realistic and convenient workflows for different situations and users.
This also applies to Mmt’s surface syntax, which takes an amount of time and effort to get accustomed to that is entirely reasonable, but potentially prohibitive for, e.g., working mathematicians not familiar with formal systems.
I have written an Mmt plugin for IntelliJ IDEA, a Java-based IDE which offers a rich plugin API, to experiment with integrating various generic services in a more flexible IDE than the only previously supported one, jEdit. In the future, I would like to investigate the possibility of using e.g. sTeX [Koh08] or possibly even plain LaTeX as an alternative input language for formal Mmt content, as well as designing a unifying online IDE on MathHub for working with Mmt and its connected systems and libraries. This would obviate the need to set up and use Mmt (and the multitude of systems with which Mmt can interact) locally – a process which (even using Docker images, Jupyter notebooks and similar solutions) can be prohibitively complicated for non-experts, especially when lacking a unifying user interface.
Chapter 18
Bibliography
[LFX]
MathHub MMT/LFX Git Repository. url: [Mita]
MitM/Foundation. url: [Mitb]
MitM/Interfaces. url: [Mitc]
MitM/smglom. url: [MMT]
UniFormal/MMT – The MMT Language and System. url: [OMU]
OMDoc/MMT Urtheories. url: [PRA]
Public Repository for Alignments. [Con+]
A. Condoluci, M. Kohlhase, D. Müller, F. Rabe, C. S. Coen, and M. Wenzel. “Relational Data Across Mathematical Libraries”. Accepted for CICM 2019. url: [Deh+16]
P.-O. Dehaye et al.. “Interoperability in the OpenDreamKit Project: The Math-in-the-Middle Approach”. in: Intelligent Computer Mathematics 2016. Conferences on Intelligent Computer Mathematics (Bialystok, Poland, July 25–29, 2016). ed. by M. Kohlhase, M. Johansson, B. Miller, L. de Moura, and F. Tompa. LNAI 9791. Springer, 2016. url: [IntV15]
D. Müller. Theory Intersections in MMT. url: [Kal+16]
C. Kaliszyk, M. Kohlhase, D. Müller, and F. Rabe. “A Standard for Aligning Mathematical Concepts”. in: Work in Progress at CICM 2016. ed. by A. Kohlhase, M. Kohlhase, P. Libbrecht, B. Miller, F. Tompa, A. Naummowicz, W. Neuper, P. Quaresma, and M. Suda. CEUR-WS.org, 2016, 229–244.[Koh+17a]
M. Kohlhase, D. Müller, M. Pfeiffer, F. Rabe, N. Thiéry, V. Vasilyev, and T. Wiesing. “Knowledge-Based Interoperability for Mathematical Software Systems”. in: Mathematical Aspects of Computer and Information Sciences. ed. by J. Blömer, I. Kotsireas, T. Kutsia, and D. Simos. Springer, 2017, 195–210.[Koh+17b]
M. Kohlhase, T. Koprucki, D. Müller, and K. Tabelow. “Mathematical models as research data via flexiformal theory graphs”. in: Intelligent Computer Mathematics (CICM) 2017. Conferences on Intelligent Computer Mathematics. ed. by H. Geuvers, M. England, O. Hasan, F. Rabe, and O. Teschke. LNAI 10383. Springer, 2017. doi: [Koh+17c]
M. Kohlhase, D. Müller, S. Owre, and F. Rabe. “Making PVS Accessible to Generic Services by Interpretation in a Universal Format”. in: Interactive Theorem Proving. ed. by M. Ayala-Rincón and C. A. Muñoz. vol. 10499. LNCS. Springer, 2017. url: [Kop+18]
T. Koprucki, M. Kohlhase, K. Tabelow, D. Müller, and F. Rabe. “Model pathway diagrams for the representation of mathematical models”. in: Journal of Optical and Quantum Electronics 50.2 (2018), 70. doi: [MK15]
D. Müller and M. Kohlhase. “Understanding Mathematical Theory Formation via Theory Intersections in MMT”. 2015. url: [MKR18]
D. Müller, M. Kohlhase, and F. Rabe. “Automatically Finding Theory Morphisms for Knowledge Management”. in: Intelligent Computer Mathematics (CICM) 2018. Conferences on Intelligent Computer Mathematics. ed. by F. Rabe, W. Farmer, A. Youssef, and .... LNAI. in press. Springer, 2018. url: [MR19]
D. Müller and F. Rabe. “Rapid Prototyping Formal Systems in MMT: 5 Case Studies”. Accepted for LFMTP 2019. 2019. url: [MRK]
D. Müller, F. Rabe, and M. Kohlhase. Theories as Types. url: [MRK18]
D. Müller, F. Rabe, and M. Kohlhase. “Theories as Types”. in: ed. by D. Galmiche, S. Schulz, and R. Sebastiani. Springer Verlag, 2018. url: [MRS]
D. Müller, F. Rabe, and C. Sacerdoti Coen. “The Coq Library as a Theory Graph”. Submitted to CICM 2019. url: [Mül+17a]
D. Müller, C. Rothgang, Y. Liu, and F. Rabe. “Alignment-based Translations Across Formal Systems Using Interface Theories”. in: Proof eXchange for Theorem Proving. ed. by C. Dubois and B. Woltzenlogel Paleo. Open Publishing Association, 2017, 77–93.[Mül+17b]
D. Müller, T. Gauthier, C. Kaliszyk, M. Kohlhase, and F. Rabe. “Classification of Alignments between Concepts of Formal Mathematical Systems”. in: Intelligent Computer Mathematics (CICM) 2017. Conferences on Intelligent Computer Mathematics. ed. by H. Geuvers, M. England, O. Hasan, F. Rabe, and O. Teschke. LNAI 10383. Springer, 2017. doi: [Mül+17c]
D. Müller, T. Gauthier, C. Kaliszyk, M. Kohlhase, and F. Rabe. Classification of Alignments between Concepts of Formal Mathematical Systems. tech. rep.. 2017. url: [RKM16]
D. Rochau, M. Kohlhase, and D. Müller. “FrameIT Reloaded: Serious Math Games from Modular Math Ontologies”. in: Intelligent Computer Mathematics – Work in Progress Papers. ed. by M. Kohlhase, A. Kohlhase, P. Libbrecht, B. Miller, A. Naumowicz, W. Neuper, P. Quaresma, F. Tompa, and M. Suda. 2016. url: [RKM17]
M. Rupprecht, M. Kohlhase, and D. Müller. “A Flexible, Interactive Theory-Graph Viewer”. in: MathUI 2017: The 12th Workshop on Mathematical User Interfaces. ed. by A. Kohlhase and M. Pollanen. 2017. url: [RM18]
F. Rabe and D. Müller. “Structuring Theories with Implicit Morphisms”. in: 24th International Workshop on Algebraic Development Techniques 2018. 2018. url: [AFP]
AFP. Archive of Formal Proofs. url: [Art]
R. Arthan. ProofPower. [Asp+06a]
A. Asperti, C. S. Coen, E. Tassi, and S. Zacchiroli. “Crafting a Proof Assistant”. in: TYPES. ed. by T. Altenkirch and C. McBride. Springer, 2006, 18–32.[Aus+03]
R. Ausbrooks et al.. “Mathematical Markup Language (MathML) v. 2.0.”. in: World Wide Web Consortium recommendation (2003).[BCH12]
M. Boespflug, Q. Carbonneaux, and O. Hermant. “The λΠ-calculus modulo as a universal proof language”. 2012.[Bra13]
E. Brady. “Idris, a general-purpose dependently typed programming language: Design and implementation”. in: Journal of Functional Programming 23.5 (2013), 552–593.[Bus+04]
S. Buswell, O. Caprotti, D. P. Carlisle, M. C. Dewar, M. Gaetano, and M. Kohlhase. The Open Math standard. tech. rep.. version 2.0. The Open Math Society, 2004.[CC]
CoCalc: Collaborative Calculation in the Cloud. url: [CFO10]
J. Carette, W. Farmer, and R. O’Connor. The MathScheme Project. [Con+86]
R. Constable et al.. Implementing Mathematics with the Nuprl Development System. Prentice-Hall, 1986.[Coq15]
Coq Development Team. The Coq Proof Assistant: Reference Manual. tech. rep.. INRIA, 2015.[Dev16]
The Sage Developers. SageMath, the Sage Mathematics Software System (Version 7.0). 2016. url: [DT]
G. Dowek and F. Thiré. “Logipedia: a multi-system encyclopedia of formal proofs”. url: [Dun+15]
C. Dunchev, F. Guidi, C. S. Coen, and E. Tassi. “ELPI: Fast, Embeddable, lambda-Prolog Interpreter”. in: Logic for Programming, Artificial Intelligence, and Reasoning. ed. by M. Davis, A. Fehnker, A. McIver, and A. Voronkov. 2015, 460–468.[FGT]
W. M. Farmer, J. Guttman, and J. Thayer. The IMPS online theory library. url: [FGT93]
W. Farmer, J. Guttman, and F. Thayer. “IMPS: An Interactive Mathematical Proof System”. in: Journal of Automated Reasoning 11.2 (1993), 213–248.[Gap]
GAP – Groups, Algorithms, and Programming, Version 4.8.2. The GAP Group. 2016. url: [GC14]
D. Ginev and J. Corneli. “NNexus Reloaded”. in: Intelligent Computer Mathematics 2014. Conferences on Intelligent Computer Mathematics (Coimbra, Portugal, July 7–11, 2014). ed. by S. Watt, J. Davenport, A. Sexton, P. Sojka, and J. Urban. LNCS 8543. Springer, 2014, 423–426. url: [Gog+93]
J. Goguen, T. Winkler, J. Meseguer, K. Futatsugi, and J. Jouannaud. “Introducing OBJ”. in: Applications of Algebraic Specification using OBJ. ed. by J. Goguen, D. Coleman, and R. Gallimore. Cambridge, 1993.[Gro16]
The GAP Group. GAP – Groups, Algorithms, and Programming. [Har+12]
T. Hardin et al.. The FoCaLiZe Essential. [Har96]
J. Harrison. “HOL Light: A Tutorial Introduction”. in: Proceedings of the First International Conference on Formal Methods in Computer-Aided Design. Springer, 1996, 265–269.[HHP93a]
R. Harper, F. Honsell, and G. Plotkin. “A framework for defining logics”. in: Journal of the Association for Computing Machinery 40.1 (1993), 143–184.[HOL4]
The HOL4 development team. HOL4. url: [Hur09]
J. Hurd. “OpenTheory: Package Management for Higher Order Logic Theories”. in: Programming Languages for Mechanized Mathematics Systems. ed. by G. D. Reis and L. Théry. ACM, 2009, 31–37.[Isa]
Isabelle. Mar. 9, 2013. url: [Jup]
Project Jupyter. url: [KMM00]
M. Kaufmann, P. Manolios, and J Moore. Computer-Aided Reasoning: An Approach. Kluwer Academic Publishers, 2000.[LMF]
The LMFDB Collaboration. The L-functions and Modular Forms Database. [Lmfa]
LMFDB Knowledge Database. url: [Lmfb]
LMFDB Knowledge Database entry for Minimal Weierstrass equation over the rationals. url: [LN12]
F. Lübeck and M. Neunhöffer. GAPDoc, A Meta Package for GAP Documentation, Version 1.5.1. 2012. url: [Mat]
Mathematical Components. url: [MC]
Mathematical Components. url: [MeMa]
Metamath Home page. url: [Mil+97]
R. Milner, M. Tofte, R. Harper, and D. MacQueen. The Definition of Standard ML, Revised edition. MIT Press, 1997.[Miz]
Mizar. [MizLib]
Mizar Mathematical Library. url: [MN86]
D. Miller and G. Nadathur. “Higher-order logic programming”. in: Proceedings of the Third International Conference on Logic Programming. ed. by E. Shapiro. Springer, 1986, 448–462.[Nor05]
U. Norell. The Agda WiKi. [NPW02]
T. Nipkow, L. Paulson, and M. Wenzel. Isabelle/HOL — A Proof Assistant for Higher-Order Logic. Springer, 2002.[OAF]
The OAF Project & System. url: [ODK]
OpenDreamKit Open Digital Research Environment Toolkit for the Advancement of Mathematics. url: [ORS92]
S. Owre, J. Rushby, and N. Shankar. “PVS: A Prototype Verification System”. in: 11th International Conference on Automated Deduction (CADE). ed. by D. Kapur. Springer, 1992, 748–752.[PC93]
L. Paulson and M. Coen. Zermelo-Fraenkel Set Theory. Isabelle distribution, ZF/ZF.thy. 1993.[PS99]
F. Pfenning and C. Schürmann. “System Description: Twelf - A Meta-Logical Framework for Deductive Systems”. in: Automated Deduction. ed. by H. Ganzinger. 1999, 202–206.[PVS]
NASA PVS Library. url: [Qed]
The QED Project. [Ran04]
A. Ranta. “Grammatical Framework — A Type-Theoretical Grammar Formalism”. in: Journal of Functional Programming 14.2 (2004), 145–189.[RK13a]
F. Rabe and M. Kohlhase. “A Scalable Module System”. in: Information and Computation 230.1 (2013), 1–54.[SC]
N. M. Thiéry et al.. Elements, parents, and categories in Sage: a primer. url: [SNG]
Singular. url: [Sut09]
G. Sutcliffe. “The TPTP Problem Library and Associated Infrastructure: The FOF and CNF Parts, v3.5.0”. in: Journal of Automated Reasoning 43.4 (2009), 337–362.[SW83]
D. Sannella and M. Wirsing. “A Kernel Language for Algebraic Specification and Implementation”. in: Fundamentals of Computation Theory. ed. by M. Karpinski. Springer, 1983, 413–427.[Tea03]
The Coq Development Team. The Coq proof assistant reference manual (version 7.4). tech. rep.. INRIA, Rocquencourt, France, 2003.[Wen09]
M. Wenzel. The Isabelle/Isar Reference Manual. [WPN08]
M. Wenzel, L. C. Paulson, and T. Nipkow. “The Isabelle Framework”. in: Theorem Proving in Higher Order Logics (TPHOLs 2008). ed. by A. Mohamed, Munoz, and Tahar. LNCS 5170. Springer, 2008, 33–38.[AH15]
S. Autexier and D. Hutter. “Structure Formation in Large Theories”. in: Intelligent Computer Mathematics 2015. Conferences on Intelligent Computer Mathematics (Washington DC, USA, July 13–17, 2015). ed. by M. Kerber, J. Carette, C. Kaliszyk, F. Rabe, and V. Sorge. LNCS 9150. Springer, 2015, 155–170.[Arx]
arXiv.org e-print archive. [Asp+06b]
A. Asperti, F. Guidi, C. S. Coen, E. Tassi, and S. Zacchiroli. “A Content Based Mathematical Search Engine: Whelp”. in: Types for Proofs and Programs, International Workshop, TYPES 2004, revised selected papers. ed. by J.-C. Filliâtre, C. Paulin-Mohring, and B. Werner. LNCS 3839. Springer Verlag, 2006, 17–32.[Bet18]
J. Betzendahl. “Translating the IMPS Theory Library to MMT / OMDoc”. Master’s Thesis. Informatik, Universität Bielefeld, Apr. 2018. url: [BKR]
K. Berčič, M. Kohlhase, and F. Rabe. “Towards a Unified Mathematical Data Infrastructure: Database and Interface Generation”. submitted to CICM 2019. url: [BL]
T. Breuer and S. Linton. “The GAP 4 Type System: Organising Algebraic Algorithms”. in: Proceedings of the 1998 International Symposium on Symbolic and Algebraic Computation. ISSAC ’98. ACM, 38–45.[Bla+14]
J. C. Blanchette et al.. “Truly Modular (Co)datatypes for Isabelle/HOL”. in: ITP. ed. by G. Klein and R. Gamboa. vol. 8558. LNCS. Springer, 2014, 93–110. doi: [Bob+11]
F. Bobot, J. Filliâtre, C. Marché, and A. Paskevich. “Why3: Shepherd Your Herd of Provers”. in: Boogie 2011: First International Workshop on Intermediate Verification Languages. 2011, 53–64.[Bou64]
N. Bourbaki. “Univers”. in: Séminaire de Géométrie Algébrique du Bois Marie - Théorie des topos et cohomologie étale des schémas. Springer, 1964, 185–217.[Car+19]
J. Carette, W. M. Farmer, M. Kohlhase, and F. Rabe. “Big Math and the One-Brain Barrier – A Position Paper and Architecture Proposal”. submitted to Mathematical Intelligencer. 2019. url: [CF58]
H. Curry and R. Feys. Combinatory Logic. Amsterdam: North-Holland, 1958.[CH88]
T. Coquand and G. Huet. “The Calculus of Constructions”. in: Information and Computation 76.2/3 (1988), 95–120.[Chu40]
A. Church. “A Formulation of the Simple Theory of Types”. in: Journal of Symbolic Logic 5.1 (1940), 56–68.[Cod+11]
M. Codescu, F. Horozal, M. Kohlhase, T. Mossakowski, and F. Rabe. “Project Abstract: Logic Atlas and Integrator (LATIN)”. in: Intelligent Computer Mathematics. ed. by J. Davenport, W. Farmer, F. Rabe, and J. Urban. Springer, 2011, 289–291.[Dyb97]
P. Dybjer. “Representing Inductively Defined Sets by Wellorderings in Martin-LÖF’s Type Theory”. in: Theor. Comput. Sci. 176.1-2 (Apr. 1997), 329–335. doi: [EI]
EINFRA-9: e-Infrastructure for Virtual Research Environment. url: [ESC07]
J. Euzenat, P. Shvaiko, and E. Corporation. Ontology matching. Springer, 2007.[FGT92a]
W. Farmer, J. Guttman, and F. Thayer. “Little Theories”. in: Conference on Automated Deduction. ed. by D. Kapur. 1992, 467–581.[FGT92b]
W. M. Farmer, J. Guttman, and J. Thayer. “Little Theories”. in: Proceedings of the 11th Conference on Automated Deduction. ed. by D. Kapur. Springer, 1992.
[Geu+17]
H. Geuvers, M. England, O. Hasan, F. Rabe, and O. Teschke, eds.. Intelligent Computer Mathematics. Conferences on Intelligent Computer Mathematics. LNAI 10383. Springer, 2017. doi: [Gin+16]
D. Ginev et al.. “The SMGloM Project and System. Towards a Terminology and Ontology for Mathematics”. in: Mathematical Software - ICMS 2016 - 5th International Congress. ed. by G.-M. Greuel, T. Koch, P. Paule, and A. Sommese. vol. 9725. LNCS. Springer, 2016. doi: [GK14a]
T. Gauthier and C. Kaliszyk. “Matching concepts across HOL libraries”. in: Intelligent Computer Mathematics. ed. by S. Watt, J. Davenport, A. Sexton, P. Sojka, and J. Urban. Springer, 2014, 267–281.[GK14b]
T. Gauthier and C. Kaliszyk. “Matching concepts across HOL libraries”. in: CICM. ed. by S. Watt, J. Davenport, A. Sexton, P. Sojka, and J. Urban. vol. 8543. LNCS. Springer Verlag, 2014, 267–281. doi: [GK15]
T. Gauthier and C. Kaliszyk. “Sharing HOL4 and HOL Light Proof Knowledge”. in: LPAR. ed. by M. Davis, A. Fehnker, A. McIver, and A. Voronkov. vol. 9450. LNCS. Springer, 2015, 372–386. doi: [GKU16]
T. Gauthier, C. Kaliszyk, and J. Urban. “Initial Experiments with Statistical Conjecturing over Large Formal Corpora”. in: Work in Progress at CICM 2016. ed. by A. Kohlhase et al.. vol. 1785. CEUR. CEUR-WS.org, 2016, 219–228.[Gon+13]
G. Gonthier et al.. “A Machine-Checked Proof of the Odd Order Theorem”. in: Interactive Theorem Proving. ed. by S. Blazy, C. Paulin-Mohring, and D. Pichardie. 2013, 163–179.[Hal+15a]
T. Hales et al.. A formal proof of the Kepler conjecture. 2015. url: [Hal+15b]
T. C. Hales et al.. “A formal proof of the Kepler conjecture”. in: CoRR abs/1501.02155 (2015). url: [HHP93b]
R. Harper, F. Honsell, and G. Plotkin. “A framework for defining logics”. in: Journal of the Association for Computing Machinery 40.1 (1993), 143–184.[HKR12a]
F. Horozal, M. Kohlhase, and F. Rabe. “Extending MKM Formats at the Statement Level”. in: Intelligent Computer Mathematics. ed. by J. Campbell, J. Carette, G. Dos Reis, J. Jeuring, P. Sojka, V. Sorge, and M. Wenzel. Springer, 2012, 64–79.[HKR12b]
F. Horozal, M. Kohlhase, and F. Rabe. “Extending MKM Formats at the Statement Level”. in: Intelligent Computer Mathematics. Conferences on Intelligent Computer Mathematics (CICM) (Bremen, Germany, July 9–14, 2012). ed. by J. Jeuring, J. A. Campbell, J. Carette, G. Dos Reis, P. Sojka, M. Wenzel, and V. Sorge. LNAI 7362. Berlin and Heidelberg: Springer Verlag, 2012, 65–80. url: [HKR14]
F. Horozal, M. Kohlhase, and F. Rabe. “Flexary Operators for Formalized Mathematics”. in: Intelligent Computer Mathematics 2014. Conferences on Intelligent Computer Mathematics (Coimbra, Portugal, July 7–11, 2014). ed. by S. Watt, J. Davenport, A. Sexton, P. Sojka, and J. Urban. LNCS 8543. Springer, 2014, 312–327. url: [Hor14]
F. Horozal. “A Framework for Defining Declarative Languages”. PhD thesis. Jacobs University Bremen, Nov. 2014. url: [How80]
W. Howard. “The formulas-as-types notion of construction”. in: To H.B. Curry: Essays on Combinatory Logic, Lambda-Calculus and Formalism. Academic Press, 1980, 479–490.[HS02]
M. Hofmann and T. Streicher. “The Groupoid Interpretation of Type Theory”. in: (Apr. 2002).[Ian+13]
M. Iancu, M. Kohlhase, F. Rabe, and J. Urban. “The Mizar Mathematical Library in OMDoc: Translation and Applications”. in: Journal of Automated Reasoning 50.2 (2013), 191–202.[Ian+14]
M. Iancu, C. Jucovschi, M. Kohlhase, and T. Wiesing. “System Description: MathHub.info”. in: Intelligent Computer Mathematics. ed. by S. Watt, J. Davenport, A. Sexton, P. Sojka, and J. Urban. Springer, 2014, 431–434.[Ian17]
M. Iancu. “Towards Flexiformal Mathematics”. PhD thesis. Bremen, Germany: Jacobs University, 2017. url: [Jeu+12]
J. Jeuring, J. A. Campbell, J. Carette, G. Dos Reis, P. Sojka, M. Wenzel, and V. Sorge, eds.. Intelligent Computer Mathematics. Conferences on Intelligent Computer Mathematics (CICM) (Bremen, Germany, July 9–14, 2012). LNAI 7362. Berlin and Heidelberg: Springer Verlag, 2012.[KK13a]
C. Kaliszyk and A. Krauss. “Scalable LCF-style proof translation”. in: Interactive Theorem Proving. ed. by S. Blazy, C. Paulin-Mohring, and D. Pichardie. Springer, 2013, 51–66.[KK13b]
C. Kaliszyk and A. Krauss. “Scalable LCF-style proof translation”. in: ITP. ed. by S. Blazy, C. Paulin-Mohring, and D. Pichardie. vol. 7998. LNCS. Springer Verlag, 2013, 51–66.[Koh08]
M. Kohlhase. “Using LaTeX as a Semantic Markup Format”. in: Mathematics in Computer Science 2.2 (2008), 279–304.[Koh13a]
M. Kohlhase. “The Flexiformalist Manifesto”. in: 14th International Workshop on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC 2012). ed. by A. Voronkov, V. Negru, T. Ida, T. Jebelean, D. Petcu, S. M. Watt, and D. Zaharie. Timisoara, Romania: IEEE Press, 2013, 30–36. url: [Koh13b]
M. Kohlhase. “The OMDoc2 Language Design”. KWARC Blue Note. 2013. url: [Koh14]
M. Kohlhase. “Mathematical Knowledge Management: Transcending the One-Brain-Barrier with Theory Graphs”. in: EMS Newsletter (June 2014), 22–27. url: [KP93]
M. Kohlhase and F. Pfenning. “Unification in a λ-Calculus with Intersection Types”. 1993.[KR14]
C. Kaliszyk and F. Rabe. “Towards Knowledge Management for HOL Light”. in: Intelligent Computer Mathematics. ed. by S. Watt, J. Davenport, A. Sexton, P. Sojka, and J. Urban. Springer, 2014, 357–372.[KR16a]
M. Kohlhase and F. Rabe. “QED Reloaded: Towards a Pluralistic Formal Library of Mathematical Knowledge”. in: Journal of Formalized Reasoning 9.1 (2016), 201–234.[KR16b]
M. Kohlhase and F. Rabe. “QED Reloaded: Towards a Pluralistic Formal Library of Mathematical Knowledge”. in: Journal of Formalized Reasoning 9.1 (2016), 201–234. url: [KRSC11]
M. Kohlhase, F. Rabe, and C. Sacerdoti Coen. “A Foundational View on Integration Problems”. in: Intelligent Computer Mathematics. ed. by J. Davenport, W. Farmer, F. Rabe, and J. Urban. LNAI 6824. Springer Verlag, 2011, 107–122. url: [KŞ06]
M. Kohlhase and I. Şucan. “A Search Engine for Mathematical Formulae”. in: Artificial Intelligence and Symbolic Computation. ed. by T. Ida, J. Calmet, and D. Wang. Springer, 2006, 241–253.[KS10]
A. Krauss and A. Schropp. “A Mechanized Translation from Higher-Order Logic to Set Theory”. in: Interactive Theorem Proving. ed. by M. Kaufmann and L. Paulson. Springer, 2010, 323–338.[KU15]
C. Kaliszyk and J. Urban. “HOL(y)Hammer: Online ATP Service for HOL Light”. in: Mathematics in Computer Science 9.1 (2015), 5–22.[KW03]
H. Kanayama and H. Watanabe. “Multilingual translation via annotated hub language”. in: MT-Summit IX. 2003, 202–207.[KW10]
C. Keller and B. Werner. “Importing HOL Light into Coq”. in: Interactive Theorem Proving. ed. by M. Kaufmann and L. Paulson. Springer, 2010, 307–322.[Luo09]
Z. Luo. “Manifest Fields and Module Mechanisms in Intensional Type Theory”. in: Types for Proofs and Programs. ed. by S. Berardi, F. Damiani, and U. de’Liguoro. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009, 237–255.[MH]
MathHub.info: Active Mathematics. url: [ML74]
P. Martin-Löf. “An Intuitionistic Theory of Types: Predicative Part”. in: Proceedings of the ’73 Logic Colloquium. North-Holland, 1974, 73–118.[ML94]
P. Martin-Löf. Intuitionistic Type Theory. Bibliopolis, 1994.[MWP]
Matroid — Wikipedia, The Free Encyclopedia. url: [NK07]
I. Normann and M. Kohlhase. “Extended Formula Normalization for ε-Retrieval and Sharing of Mathematical Knowledge”. 2007.[Nor08]
I. Normann. “Automated Theory Interpretation”. PhD thesis. Bremen, Germany: Jacobs University, 2008. url: [NSM01]
P. Naumov, M. Stehr, and J. Meseguer. “The HOL/NuPRL proof translator - a practical approach to formal interoperability”. in: 14th International Conference on Theorem Proving in Higher Order Logics. ed. by R. Boulton and P. Jackson. Springer, 2001.[OS06a]
S. Obua and S. Skalberg. “Importing HOL into Isabelle/HOL”. in: Automated Reasoning. ed. by N. Shankar and U. Furbach. vol. 4130. Springer, 2006.[OS06b]
S. Obua and S. Skalberg. “Importing HOL into Isabelle/HOL”. in: Automated Reasoning: Third International Joint Conference, IJCAR 2006, Seattle, WA, USA, August 17-20, 2006. Proceedings. ed. by U. Furbach and N. Shankar. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, 298–302. doi: [Pfe01]
F. Pfenning. “Logical frameworks”. in: Handbook of automated reasoning. ed. by J. Robinson and A. Voronkov. Elsevier, 2001, 1063–1147.[Pfe+03]
F. Pfenning, C. Schürmann, M. Kohlhase, N. Shankar, and S. Owre.. The Logosphere Project. [Pfe93]
F. Pfenning. “Refinement Types for Logical Frameworks”. in: Informal Proceedings of the 1993 Workshop on Types for Proofs and Programs. ed. by H. Geuvers. Nijmegen, The Netherlands: University of Nijmegen, May 1993, 285–301.[Rab12]
F. Rabe. “A Query Language for Formal Mathematical Libraries”. in: Intelligent Computer Mathematics. Conferences on Intelligent Computer Mathematics (CICM) (Bremen, Germany, July 9–14, 2012). ed. by J. Jeuring, J. A. Campbell, J. Carette, G. Dos Reis, P. Sojka, M. Wenzel, and V. Sorge. LNAI 7362. Berlin and Heidelberg: Springer Verlag, 2012, 142–157. arXiv: [Rab17a]
F. Rabe. “A Modular Type Reconstruction Algorithm”. in: ACM Transactions on Computational Logic (2017). accepted pending minor revision; see [Rab17b]
F. Rabe. “How to Identify, Translate, and Combine Logics?”. in: Journal of Logic and Computation 27.6 (2017), 1753–1798.[RK13b]
F. Rabe and M. Kohlhase. “A Scalable Module System”. in: Information & Computation 0.230 (2013), 1–54. url: [RKS11]
F. Rabe, M. Kohlhase, and C. Sacerdoti Coen. “A Foundational View on Integration Problems”. in: Intelligent Computer Mathematics. ed. by J. Davenport, W. Farmer, F. Rabe, and J. Urban. Springer, 2011, 107–122.[Sol95]
R. Solomon. “On Finite Simple Groups and Their Classification”. in: Notices of the AMS (Feb. 1995), 231–239.[SS89]
M. Schmidt-Schauß. Computational Aspects of an Order-Sorted Logic with Term Declarations. LNAI 395. Springer Verlag, 1989.[Sto+17]
C. Stolze, L. Liquori, F. Honsell, and I. Scagnetto. “Towards a Logical Framework with Intersection and Union Types”. in: 11th International Workshop on Logical Frameworks and Meta-languages, LFMTP. Oxford, United Kingdom, Sept. 2017, 1 –9. url: [SW11]
B. Spitters and E. van der Weegen. “Type Classes for Mathematics in Type Theory”. in: CoRR abs/1102.1323 (2011). arXiv: [Uni13]
The Univalent Foundations Program. Homotopy Type Theory: Univalent Foundations of Mathematics. Institute for Advanced Study: [VJS]
vis.js - A dynamic, browser based visualization library. url: [Wat+14]
S. Watt, J. Davenport, A. Sexton, P. Sojka, and J. Urban, eds.. Intelligent Computer Mathematics. Conferences on Intelligent Computer Mathematics (Coimbra, Portugal, July 7–11, 2014). LNCS 8543. Springer, 2014.[Wie06]
F. Wiedijk. The Seventeen Provers of the World. Springer, 2006.[Wie92]
G. Wiederhold. “Mediators in the architecture of future information systems”. in: Computer 25.3 (1992), 38–49.[WKR17]
T. Wiesing, M. Kohlhase, and F. Rabe. “Virtual Theories – A Uniform Interface to Mathematical Knowledge Bases”. in: MACIS 2017: Seventh International Conference on Mathematical Aspects of Computer and Information Sciences. ed. by J. Blömer, T. Kutsia, and D. Simos. LNCS 10693. Springer Verlag, 2017, 243–257. url: [WR13]
A. Whitehead and B. Russell. Principia Mathematica. Cambridge University Press, 1913.